TIW: STL #1

2016-10-19

This post in a nutshell:

vector<vector<pair<int, map<int, set<int> > > > >
  nutshell(n, vector<pair<int, map<int, set<int> > > >(m));

If you know what that means (not that it MEANS anything), then you can skip this post. The STL is one of the most basic and useful topics. Used with good practices, its containers can make your code much more concise and easier to debug.

Vector

Basic facts:

push_back() is amortized O(1), pop_back() is O(1);
you can make 2D arrays with different lengths for each row;
initialize a multi-dimensional array in 1 line;
let vector take care of all the dynamic allocation and deallocation of memory you ever need;
avoid stack overflow at times.
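
As a rough sketch of the "different lengths for each row" point, the snippet below builds a triangular 2D vector where row i holds i + 1 elements. The helper name makeTriangle is made up for illustration, not something from the STL.

```cpp
#include <vector>
using namespace std;

// Hypothetical helper: builds a "triangle" where row i has i + 1 elements.
vector<vector<int> > makeTriangle(int n) {
    vector<vector<int> > rows(n);      // n rows, all empty to begin with
    for (int i = 0; i < n; i++)
        for (int j = 0; j <= i; j++)
            rows[i].push_back(j);      // each row grows independently
    return rows;
}
```

Calling makeTriangle(3) should give rows of sizes 1, 2 and 3, and the vector releases all of that memory by itself when it goes out of scope.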

Some initialization methods:

// Initialization of 1D vector with no elements:
vector<int> x;
 
// Initialization of 1D vector with 10 elements, default value 0:
vector<int> x(10);
 
// Initialization of 1D vector with 10 elements, default value -1:
vector<int> x(10, -1);
 
// Initialization of 2D vector of 10 rows x 20 columns, default value 0:
vector<vector<int> > x(10, vector<int>(20));
 
// Initialization of 2D vector of 10 rows x 20 columns, default value -1:
vector<vector<int> > x(10, vector<int>(20, -1));
 
// Initialization of a 1D vector of user defined values:
vector<int> x{1, 2, 3, 4, 5, 6, 7, 8, 9, 10};

Vectors can be used as stacks, with push_back(), back(), and pop_back().

vector<int> stack{1, 2, 3, 4, 5};
while (!stack.empty()) {
    cout << stack.back() << endl;
    stack.pop_back();
}
// output: 5 4 3 2 1

I will cover more usages of vectors, for example for BFS and DFS, when I get through more problems.

Set

Sets are very useful and powerful. Basic facts:

elements are unique;
inserting, finding, or erasing any element, and taking the smallest or largest, are all O(log(n)), so you can use it as a heap/priority queue;
elements are in sorted order;
binary search is built-in, which I will go through in a coming post;
inside the black box, sets are balanced binary search trees.

// Initializing an empty set
set<int> s;
// Initializing a set with elements from a vector
set<int> s(v.begin(), v.end());

Other operations:

// Finding if an element is in a set:
if (s.find(k) == s.end()) { // alternatively, if (s.count(k) == 0) {
    cout << "not in the set" << endl;
} else {
    cout << "found in the set" << endl;
}
// Inserting to a set:
s.insert(k);
// Finding the smallest and largest in a set:
if (!s.empty()) {
    cout << "smallest element: " << *s.begin() << endl;
    cout << "largest element: " << *s.rbegin() << endl;
}
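// Erasing an element in the set:
s.erase(k);           // by value; does nothing if k is not in the set
// The next two lines dereference or erase s.begin(), so s must not be empty:
s.erase(s.begin());   // by iterator; removes the smallest element
s.erase(*s.begin());  // by value again, here the value of the smallest element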

Notice that there are two ways to erase an element: you can either pass in the iterator of the element, or pass in the actual element value. I will probably leave iterators to another post so this one is not super long.
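
To illustrate the heap/priority queue usage mentioned in the basic facts, here is a minimal sketch that keeps removing the smallest element until the set is empty. The function name drainAscending is hypothetical, and the set is taken by value so the caller's copy is left alone.

```cpp
#include <set>
#include <vector>
using namespace std;

// Hypothetical helper: repeatedly takes the smallest element out of the set.
// The result comes out in ascending order, with duplicates already collapsed.
vector<int> drainAscending(set<int> s) {
    vector<int> out;
    while (!s.empty()) {
        out.push_back(*s.begin());  // read the smallest element
        s.erase(s.begin());         // erase it by iterator
    }
    return out;
}
```

If you insert 3, 1, 2, 3 and drain the set, you should get back 1, 2, 3.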

When you do not need the elements sorted, you might want to consider using std::unordered_set instead of set, which is basically a hash table version of set. Insert, find, and erase become O(1) on average, but the space required is larger, the constant factor is larger (which means it might hurt performance for small set sizes), and you will not be able to get the smallest element in O(log(n)).
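
A typical case where order does not matter is duplicate detection. The sketch below is one way to do it with unordered_set; hasDuplicate is an illustrative name, and it relies on insert() returning a pair whose second member says whether the element was newly inserted.

```cpp
#include <unordered_set>
#include <vector>
using namespace std;

// Hypothetical helper: true if any value appears more than once in v.
bool hasDuplicate(const vector<int>& v) {
    unordered_set<int> seen;
    for (int x : v) {
        // insert() returns {iterator, bool}; the bool is false if x was already there
        if (!seen.insert(x).second) return true;
    }
    return false;
}
```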

Sometimes you do not want the elements to be unique, say you want to keep the height of each student in a class in a set. Then you would want std::multiset, which is basically set without the unique element requirement.
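
One thing to watch with multiset is that the two ways of erasing behave differently: erasing by value removes every copy, while erasing by iterator removes just one. A small sketch, with eraseOne as a made-up helper name:

```cpp
#include <set>
using namespace std;

// Hypothetical helper: removes a single copy of h, if there is one.
// Returns true when something was erased.
bool eraseOne(multiset<int>& heights, int h) {
    multiset<int>::iterator it = heights.find(h);
    if (it == heights.end()) return false;
    heights.erase(it);  // by iterator: only this one copy goes away
    return true;
}
```

By contrast, heights.erase(h) would remove all students with height h at once and return how many were removed.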

Map

Maps are like dictionaries: a map takes one simple thing (a word) to another thing (its definition). Basic facts:

keys are unique;
inserting, finding, or erasing any key, and taking the smallest or largest key, are all O(log(n));
keys are in sorted order;
binary search is built-in;
inside the black box, maps are balanced binary search trees, each node containing a second piece of data (the value) that is not used for comparison.

Make a map:

map<int, int> m;

Put something in the map:

m[42] = 3;

Tip: you can access a non-existent key this way, because operator[] first inserts the key with a default value (0 for int):

map<int, int> m;
m[0]++;

Accessing a non-existent key with at() throws an exception (std::out_of_range) instead, which is sometimes good for debugging. For a key that already exists, it is equivalent to the above:

m.at(0)++;
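
To see the difference between the two access styles, here is a minimal sketch that catches the exception from at(). The helper name atThrowsOnMissing is invented for this example.

```cpp
#include <map>
#include <stdexcept>
using namespace std;

// Hypothetical helper: tries m.at(key)++ and reports whether it threw.
bool atThrowsOnMissing(map<int, int>& m, int key) {
    try {
        m.at(key)++;                 // throws if key is not in the map
        return false;
    } catch (const out_of_range&) {
        return true;                 // the map is left unchanged
    }
}
```

On an empty map this should return true and leave the map empty, whereas m[key]++ would quietly create the key with value 1.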

Pair

I think pair is the most underrated STL data structure ever. It is not taught in classes usually (not in mine anyways) because it does not contain any special algorithm, but it is a great shorthand for simple data structures. Basic facts:

a pair sticks 2 things together;
it has a default comparator which compares the first guy, and if they tie, compares the second guy.

To make a pair:

pair<int, int> p0;
pair<int, int> p1 = make_pair(1, 2);
if (p0 < p1)
    cout << "0 < 1 confirmed" << endl;
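
The default comparator is what makes pairs so handy: sorting a vector of pairs needs no custom comparison function. A small sketch, where sortedPairs is just an illustrative wrapper around std::sort:

```cpp
#include <algorithm>
#include <utility>
#include <vector>
using namespace std;

// Hypothetical helper: returns a copy of v sorted by first, ties broken by second.
vector<pair<int, int> > sortedPairs(vector<pair<int, int> > v) {
    sort(v.begin(), v.end());  // uses pair's built-in operator<
    return v;
}
```

Given (2, 1), (1, 5), (1, 3), the result should be (1, 3), (1, 5), (2, 1).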

Now that we have all the pieces, we can start sticking things one inside another. Let’s grab a Leetcode problem:

Valid Anagram

TLDR: given two strings with lowercase letters, determine if they are anagrams of each other.

Here’s one way to do it using vector, with f[i] counting the frequency of the (i+1)th letter of the alphabet:

bool isAnagram(string s, string t) {
    if (s.length() != t.length()) return false;
    vector<int> f(26);
    for (int i = 0; i < s.length(); i++) {
        f[s[i]-'a']++;
        f[t[i]-'a']--;
    }
    for (int i = 0; i < 26; i++)
        if (f[i] != 0) return false;
    return true;
}

Here’s another way to do it using a map (an unordered_map in this case, since we do not need sorted keys), same idea:

bool isAnagram(string s, string t) {
    if (s.length() != t.length()) return false;
    unordered_map<char, int> m;
    for (int i = 0; i < s.length(); i++) {
        m[s[i]]++;
        m[t[i]]--;
    }
    for (unordered_map<char, int>::iterator i = m.begin(); i != m.end(); i++)
        if (i->second != 0) return false;
    return true;
}

Yeah… That’s quite trivial. But we’ll get to the fun stuff after we have all the basics.

One final remark for STL data structures: put them in the scope where they belong. For example, if you need to create a vector 10 times, don’t make one vector as a global variable but rather create the vector within the for loop, so you won’t accidentally use garbage values left from the last iteration.
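
As a sketch of that advice, the function below counts letters for several words and declares the frequency vector inside the loop, so every word starts from a clean, zero-filled vector. The name maxLetterCounts is hypothetical, and the code assumes each word contains only lowercase letters.

```cpp
#include <algorithm>
#include <string>
#include <vector>
using namespace std;

// Hypothetical helper: for each word, the count of its most frequent letter.
// Assumes every word consists of lowercase letters 'a'..'z' only.
vector<int> maxLetterCounts(const vector<string>& words) {
    vector<int> result;
    for (size_t w = 0; w < words.size(); w++) {
        vector<int> f(26);  // created fresh (all zeros) for every word
        for (size_t i = 0; i < words[w].size(); i++)
            f[words[w][i] - 'a']++;
        result.push_back(*max_element(f.begin(), f.end()));
    }
    return result;
}
```

Had f been a global, the counts from "aab" would leak into the next word unless you remembered to reset it by hand.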