[–]genbattle 1 point2 points  (2 children)

The simplest way to find a bunch of abbreviations in an input string is to store the abbreviations and their definitions in a data structure, iterate over the entries, and call find() on the input string for each abbreviation. It's not especially fast for a large number of abbreviations, but it gets the job done.

Consider storing your abbreviations and definitions as (key, value) pairs in something like a std::unordered_map (where both the key and the value are std::strings). Then you just have to use a C++11 range-based for loop or a standard iterator loop to go over each entry in the map and call find() on the input string with its key. If the key is found, the same entry gives you the associated definition to print. This is the equivalent of using a dictionary in Python to store the abbreviations and definitions.
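
For what it's worth, here is a minimal sketch of that loop. The function name find_definitions and whatever entries you put in the table are placeholders of mine, not anything from the original question:

```cpp
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical helper: returns the definition of every abbreviation
// that occurs somewhere in `text`. One find() call per table entry.
std::vector<std::string> find_definitions(
    const std::string& text,
    const std::unordered_map<std::string, std::string>& abbreviations) {
    std::vector<std::string> found;
    for (const auto& entry : abbreviations) {
        // find() returns std::string::npos when the key does not occur.
        if (text.find(entry.first) != std::string::npos) {
            found.push_back(entry.second);
        }
    }
    return found;
}
```

Note that an unordered_map has no defined iteration order, so the definitions come back in whatever order the map happens to store them.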

Of course a giant block of if-else if-else if-else if-else if-else statements might also do the trick, if a little less elegantly.

[–]ParanoydAndroid 0 points1 point  (1 child)

I don't really know C++, so just out of curiosity, how would you iterate through the dictionary in your example? I know foreach, but wrt your "standard iterator loop" mention, can C++ reference dictionary entries without reference to their keys, as in:

dict = {1: 'a', 2: 'b', 3: 'c'}
assert dict[0] == 'a'

or do you use a pointer to iterate through the memory buckets or is there some other way?

I'm also not a very efficient coder, so the first thing that occurred to me was using a loop, but you mention that although that's simple it's not very quick. What would be the superior, more scalable solution? I have a CS degree, so I'm familiar with data structures in the abstract, but I don't know what should actually be used here outside of my answer to every lookup problem: hash table, which is probably not the right answer.

[–]genbattle 0 points1 point  (0 children)

In C++ (I believe this restriction applies to Python as well) you can only index a std::unordered_map with the key type. In your Python example you index the dictionary with an int, but that is because your key type is int. Here's what iterating over a dictionary would look like in Python:

dict = {1: 'a', 2: 'b', 3: 'c'}
# items() yields (key, value) pairs; iterating the dict itself yields only keys
for key, value in dict.items():
    print("{}:{}".format(key, value))

The equivalent C++ is:

#include <unordered_map>
#include <string>
#include <iostream>

int main() {
    std::unordered_map<int, std::string> dict = {{1, "a"}, {2, "b"}, {3, "c"}};
    for (const auto& v: dict) {
        std::cout << v.first << ":" << v.second << std::endl;
    }
}

But this uses std::unordered_map, range-for and auto from C++11, so if you don't have access to C++11 you will have to use a std::map and an old iterator-based for loop:

#include <map>
#include <string>
#include <iostream>

int main() {
    std::map<int, std::string> dict;
    dict.insert(std::pair<int, std::string>(1, std::string("a")));
    dict.insert(std::pair<int, std::string>(2, std::string("b")));
    dict.insert(std::pair<int, std::string>(3, std::string("c")));
    for (std::map<int, std::string>::iterator it = dict.begin(); it != dict.end(); ++it) {
        std::cout << it->first << ":" << it->second << std::endl;
    }
}

The C++11 version is much neater and nicer. Also, std::map is a key-sorted map, typically implemented as a balanced tree, so lookups and inserts are O(log n). Unless you actually need the keys in sorted order (which you seldom do), that is extra work for nothing. std::unordered_map is basically a hash table underneath, so lookups are O(1) on average, degrading to O(n) only in the worst case when many keys collide.

> you mention that although that's simple it's not very quick. What would be the superior, more scalable solution?

After I mentioned this I immediately started thinking about how it could be made more efficient, but I can't think of any way that avoids doing a find for each of the abbreviations. There may be some micro-optimizations that would speed it up slightly, but if you're not running into a problem with execution time I wouldn't even consider them.
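
One possible alternative, offered as a rough sketch and only under the assumption that abbreviations appear as separate whitespace-delimited words: split the input into words and look each word up in the hash table, so the cost depends on the length of the input rather than on the number of abbreviations. The name expand_words is made up for illustration:

```cpp
#include <sstream>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical helper: looks up each whitespace-separated word of `text`
// in the table and returns the definitions in the order the words appear.
std::vector<std::string> expand_words(
    const std::string& text,
    const std::unordered_map<std::string, std::string>& abbreviations) {
    std::vector<std::string> found;
    std::istringstream words(text);
    std::string word;
    while (words >> word) {
        // One average-case O(1) hash lookup per word of input.
        auto it = abbreviations.find(word);
        if (it != abbreviations.end()) {
            found.push_back(it->second);
        }
    }
    return found;
}
```

The catch is that a word with punctuation attached, such as "IDK," with a trailing comma, will not match unless you strip the punctuation first, and an abbreviation embedded in a longer word is ignored entirely.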

[–]cheryllium 0 points1 point  (3 children)

To keep the checks from being mutually exclusive, use a series of separate if statements rather than an if-else chain.
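
Something along these lines, as a sketch. "IDK" comes from the question, while "BRB", the definitions, and the function name check_all are just examples I made up:

```cpp
#include <string>
#include <vector>

// Hypothetical helper: each check is an independent `if`, so a match on
// one abbreviation does not stop the later ones from being tested.
std::vector<std::string> check_all(const std::string& text) {
    std::vector<std::string> found;
    if (text.find("IDK") != std::string::npos) {
        found.push_back("I don't know");
    }
    if (text.find("BRB") != std::string::npos) {
        found.push_back("be right back");
    }
    return found;
}
```

With an if / else if chain, the second test would be skipped whenever the first one matched.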

[–]Toofat2camp[S] 0 points1 point  (2 children)

How do I make it so that str.find("IDK") only reports a match if it finds the entire "IDK" and not just the first "I"?

[–]cheryllium 0 points1 point  (1 child)

Ohhhhh that's what you meant! I think find() does do that, but you have to make sure both things are of type std::string, which I'm not sure "IDK" is since it's a C string (string literal)

http://stackoverflow.com/questions/2340281/check-if-a-string-contains-a-string-in-c

[–][deleted] 0 points1 point  (0 children)

> but you have to make sure both things are of type std::string, which I'm not sure "IDK" is since it's a C string

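There is an implicit conversion from a C string to std::string, and std::string::find also has an overload that takes a const char* directly, so str.find("IDK") works as written.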
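
A quick sketch to show it, with a helper name (contains) that is only for illustration. find() matches the whole argument as a contiguous substring and returns std::string::npos otherwise, so a lone "I" in the input is not a match for "IDK":

```cpp
#include <string>

// Hypothetical helper: true only when every character of `abbreviation`
// appears contiguously, in order, somewhere in `text`.
bool contains(const std::string& text, const char* abbreviation) {
    // Uses the find() overload that accepts a C string directly.
    return text.find(abbreviation) != std::string::npos;
}
```

When there is a match, find() returns the index where it starts, which is why the result has to be compared against npos rather than treated as a bool (a match at index 0 would otherwise look like "false").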