# C++ string split to array?



## redivivus

Yo yo. Is there any way in C++ to mimic what string.split() does in JavaScript? I want to split a large string into many elements of an array, but I can't find a solution. All the examples I've seen split the string manually, which is fine if you don't have 1000 entries to split up lol.

Any ideas?


----------



## dquigley

Yup, use a dynamic "array" (a vector). For some samples, Google "string.split vector c++".

Best,
Dan


----------



## redivivus

Okay, I did some research. This is my first C++ app and I'm pretty lost with this stuff.

I'm trying to make this work, but it's not working. Can you help me? :up:



Code:


vector<string> split(const string& value, char separator, int LENG)
{
    vector<string> result;
    for (int i=1;i<=LENG;i++)
    {
        string::size_type pos = value.find_first_of(separator);
        value = value.substr(0, pos) + "%" + value.substr(pos+1);
        result.push_back(value.substr(0, pos));
    }
    return result;
}

I had it working, but then I needed to replace the separator every time so it wouldn't find the first one over and over. I think the problem is trying to edit the const string?

Also, do you think there is a more efficient way than running the loop thousands of times?


----------



## GCDude

You don't need to check every character. find_first_of can be given a parameter saying where to start, so you just need one iteration per separator. Something like this should work.



Code:


vector<string> split(const string& strValue, char separator)
{
    vector<string> vecstrResult;
    // find_first_of returns string::size_type and string::npos on failure,
    // so use those instead of int and -1
    string::size_type startpos = 0;
    string::size_type endpos = strValue.find_first_of(separator, startpos);

    while (endpos != string::npos)
    {
        vecstrResult.push_back(strValue.substr(startpos, endpos - startpos)); // add to vector
        startpos = endpos + 1; // jump past separator
        endpos = strValue.find_first_of(separator, startpos); // find next
        if (endpos == string::npos)
        {
            // last one, so no 2nd param required to go to end of string
            vecstrResult.push_back(strValue.substr(startpos));
        }
    }

    return vecstrResult;
}

Note this doesn't handle the case where there is no separator. Add your own case for that.
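
One way to cover the no-separator case is to always push whatever is left once the loop ends. This is only a sketch: the name `split_all` is made up for illustration, and it uses `find` with a single `char`, which behaves the same as `find_first_of` here.

```cpp
#include <string>
#include <vector>

// Hypothetical variant of the function above: the remainder after the last
// separator is always pushed, so a string with no separator comes back as
// a single element instead of an empty vector.
std::vector<std::string> split_all(const std::string& value, char separator)
{
    std::vector<std::string> result;
    std::string::size_type startpos = 0;
    std::string::size_type endpos = value.find(separator, startpos);
    while (endpos != std::string::npos)
    {
        result.push_back(value.substr(startpos, endpos - startpos));
        startpos = endpos + 1;                     // jump past the separator
        endpos = value.find(separator, startpos);  // find the next one
    }
    result.push_back(value.substr(startpos));      // last (or only) piece
    return result;
}
```

With this version, `split_all("two three four five", ' ')` should give four elements, and `split_all("abc", ',')` should give the single element `"abc"`.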


----------



## AGCurry

Why not just use strtok() ?


----------



## Shadow2531

Here's an example:



Code:


#include <iostream>
#include <string>
#include <algorithm>
#include <iterator>
#include <vector>
using namespace std;

inline vector<string> split( const string& s, const string& f ) {
    vector<string> temp;
    if ( f.empty() ) {
        temp.push_back( s );
        return temp;
    }
    typedef string::const_iterator iter;
    const iter::difference_type f_size( distance( f.begin(), f.end() ) );
    iter i( s.begin() );
    for ( iter pos; ( pos = search( i , s.end(), f.begin(), f.end() ) ) != s.end(); ) {
        temp.push_back( string( i, pos ) );
        advance( pos, f_size );
        i = pos;
    }
    temp.push_back( string( i, s.end() ) );
    return temp;
}


int main() {
    const vector<string> test( split( "1,2,3,4,5,6,7,8,9", "," ) );
    copy( test.begin(), test.end(), ostream_iterator<string>(cout, "\n" ) );
}

/* g++ -Wall -Wextra this.cpp -o this -O3 -mtune=i686 -s */

It doesn't support the limit param like the JS split() does, but I assume you don't usually use that. Also, just like the JS split(), if 2 delimiters are right next to each other (or at the beginning/end), a blank string is added to the array. If you don't want that, check in the function whether the string is empty before calling push_back().
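
The empty-string check could look roughly like this. It is a sketch rather than a drop-in replacement: `split_skip_empty` is a made-up name, it uses `std::string::find` instead of the iterator-based `search` above, and returning an empty vector for an empty input is just one possible choice.

```cpp
#include <string>
#include <vector>

// Hypothetical helper: splits s on the delimiter f and drops empty pieces.
std::vector<std::string> split_skip_empty(const std::string& s, const std::string& f)
{
    std::vector<std::string> parts;
    if (f.empty()) {                       // nothing to split on
        if (!s.empty()) parts.push_back(s);
        return parts;
    }
    std::string::size_type start = 0;
    for (;;) {
        const std::string::size_type pos = s.find(f, start);
        // When pos is npos, substr() simply runs to the end of the string.
        const std::string piece = s.substr(start, pos - start);
        if (!piece.empty()) parts.push_back(piece);  // the empty check
        if (pos == std::string::npos) break;
        start = pos + f.size();            // continue after the delimiter
    }
    return parts;
}
```

For example, `split_skip_empty(",a,,b,", ",")` should return just `"a"` and `"b"`.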

Do note that the boost library already has exactly what you want and more.

http://www.boost.org/libs/tokenizer/char_separator.htm
http://www.boost.org/libs/tokenizer/index.html

STL's (Stephan T. Lavavej's) MinGW distro comes with the Boost library, so you can use it just like the standard library headers. (Some parts of Boost require that you link with libraries that you must build. Boost regex and Boost filesystem are a few of them. There are more, but STL's distro comes with regex and filesystem already built, so you can link with them. Boost tokenizer is header-only, so you just include the header and go for it.)


----------



## Shadow2531

AGCurry said:


> Why not just use strtok() ?


Are you saying something like this?



Code:


#include <iostream>
#include <string>
#include <vector>
#include <algorithm>
#include <iterator>
#include <cstring>
using namespace std;

int main() {
    char str[] = "1,2,3,4,5,6,7,8,9";
    char* tok = strtok (str, ",");
    vector<string> parts;
    while ( tok != NULL ) {
        parts.push_back( tok );
        tok = strtok (NULL, ",");
    }
    copy( parts.begin(), parts.end(), ostream_iterator<string>(cout, "\n") );
}

( Assuming that the target storage is an array of std::strings. )
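
If the input is a std::string rather than a char array, strtok() needs a writable copy, because it overwrites each separator it finds with '\0'. A rough sketch of that, where `split_strtok` is an assumed name and not anything from the standard library:

```cpp
#include <cstring>
#include <string>
#include <vector>

// Hypothetical wrapper: strtok() writes '\0' into its input, so work on a
// mutable, null-terminated copy rather than on the std::string itself.
std::vector<std::string> split_strtok(const std::string& s, const char* delims)
{
    std::vector<char> buffer(s.begin(), s.end());
    buffer.push_back('\0');
    std::vector<std::string> parts;
    for (char* tok = std::strtok(&buffer[0], delims); tok != NULL;
         tok = std::strtok(NULL, delims)) {
        parts.push_back(tok);  // copies the token out of the buffer
    }
    return parts;
}
```

Two things to keep in mind: strtok() treats a run of separators as one, so `"a,,b"` gives two tokens rather than the three that JS split() would return, and it keeps its position in hidden static state, so it is not safe to use from several threads or in nested loops.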


----------



## redivivus

Thanks for the replies. I will try this again when I get home. =)

As for the last one, what does 'tok' stand for? Might help me understand it.  :up:


----------



## Shadow2531

redivivus said:


> As for the last one, what does 'tok' stand for? Might help me understand it.  :up:


str is short for string
tok is short for tokenizer
strtok is short for string tokenizer
tok in the last example is just the variable name I used to store the pointer strtok returns.
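
To make the "pointer strtok returns" part a bit more concrete: the pointer points into the original buffer, not at a new copy. A small sketch, with `token_offsets` as a made-up helper name, that records where each token starts:

```cpp
#include <cstddef>
#include <cstring>
#include <vector>

// Hypothetical helper: records where each token starts inside the buffer.
// strtok() hands back pointers into buf itself, not freshly allocated copies.
std::vector<std::ptrdiff_t> token_offsets(char* buf, const char* delims)
{
    std::vector<std::ptrdiff_t> offsets;
    for (char* tok = std::strtok(buf, delims); tok != NULL;
         tok = std::strtok(NULL, delims)) {
        offsets.push_back(tok - buf);  // distance from the start of buf
    }
    return offsets;
}
```

Given `char str[] = "1,22,3";`, the offsets should come out as 0, 2 and 5. Afterwards `str` on its own reads as just `"1"`, because the commas have been replaced with '\0'.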


----------



## redivivus

Well... I gave it another shot. It just wasn't working though. I was trying to split the string "two three four five" and it kept putting "8" as the first value.

I ended up subbing in your code, GCDude, and it actually works. I honestly cannot see the difference between yours and the one I had (that was returning 8) except variable names. =( I must have had some problem lol...

When I changed it to your code, it also fixed the problem I encountered when I tried to run the program with Dev-C++: it was scrolling through screens of symbols and beeping for some reason.

Thanks all. =)


----------

