# Solved: csv file splitting and unique names



## amstro (Feb 7, 2013)

Hi friends, I need help with editing a csv file. I have multiple rows in a file and I want each row to be transfered to a separate csv file. For ex, if my csv file is as below
crj,ckjck,dbnch,dnj,ch
152,15,15,487
abc,def,ghi,jkl,abc,def,ghi,jkl
........ and so on

I want n files created where n is the number of lines in my csv files. Each csv file will have exactly one line from the source csv file. I found a code elsewhere in this forum (http://forums.techguy.org/software-development/1023949-solved-split-100000-line-csv.html). It does exactly what I need with one exception. It generates 5000 lines into one csv file, while I need a single line. This is an easy fix. I change the variable 5000 to 1 and I got the results. Worked like magic. However, I have another issue; the file name. It uses a standard file name and increments it by 1 and so all file names have the same file name with a number at the end that keeps incrementing. I cannot use this naming convention. I want each file to use the first word from that line as the file name. So in my case. The first file will have name crj.csv, the 2nd will have name 152.csv, the third abc.csv and so on.

I am not a programming guy and I need help with this task. I am very grateful for any help in advance. Thank you so much.


----------



## Rollin_Again (Sep 4, 2003)

Post your existing macro code please.

Rollin


----------



## amstro (Feb 7, 2013)

```
@echo off setlocal ENABLEDELAYEDEXPANSION REM Edit this value to change the name of the file that needs splitting. Include the extension. SET BFN=BigFile.csv REM Edit this value to change the number of lines per file. SET LPF=5000 REM Edit this value to change the name of each short file. It will be followed by a number indicating where it is in the list. SET SFN=SplitFile REM Do not change beyond this line. SET SFX=%BFN:~-3% SET /A LineNum=0 SET /A FileNum=1 For /F "delims==" %%l in (%BFN%) Do ( SET /A LineNum+=1 echo %%l >> %SFN%!FileNum!.%SFX% if !LineNum! EQU !LPF! ( SET /A LineNum=0 SET /A FileNum+=1 ) ) endlocal Pause
```
Thanks again!


----------



## Squashman (Apr 4, 2003)

How about putting some CR\LF in there.

```
@echo off 
setlocal ENABLEDELAYEDEXPANSION 

REM Edit this value to change the name of the file that needs splitting. Include the extension. 
SET BFN=BigFile.csv 

REM Edit this value to change the number of lines per file. 
SET LPF=5000 

REM Edit this value to change the name of each short file. It will be followed by a number indicating where it is in the list. 
SET SFN=SplitFile 

REM Do not change beyond this line. 
SET SFX=%BFN:~-3% 
SET /A LineNum=0 
SET /A FileNum=1 
For /F "delims==" %%l in (%BFN%) Do ( 
	SET /A LineNum+=1 
	echo %%l >> %SFN%!FileNum!.%SFX% 
	if !LineNum! EQU !LPF! (
		SET /A LineNum=0 
		SET /A FileNum+=1
	)
)
endlocal
Pause
```


----------



## Squashman (Apr 4, 2003)

Can you guarantee that the first delimited column will never have the same information otherwise you will have files that will overwrite each other.

Don't even need most of this code if you are doing 1 line per output file. It is a much smaller and less complicated batch file as long as you don't have to keep track of possible file overwrites.


----------



## foxidrive (Oct 20, 2012)

This should work. If your file.csv has a long filename with spaces etc then rename it for this.


```
@echo off
for /f "delims=" %%a in (file.csv) do (
for /f "delims=," %%b in ("%%a") do (
> "%%b.csv" echo %%a
)
)
```
If doesn't check for existing files, as Squashman has mentioned.


----------



## amstro (Feb 7, 2013)

Thanks for the reply Squashman. Yes, I can guarantee that the first word will never be repeated in any of the lines. I will try the code and update back the result.


----------



## amstro (Feb 7, 2013)

Thank you foxidrive. I will check and update back with the result. Thanks everyone for helping me out.


----------



## amstro (Feb 7, 2013)

Foxidrive, this worked like magic. I get exactly what I needed. I have to ask one more help. How can I make it so that the output csv file has the results in a column fashion and not in a row. For ex, Genius.csv file has the result as Genius,talent,smart. How can I edit your batch file so that the result is
Genius
talent
smart
So basically I will have a big csv file and then once I run the batch file, multiple csv files are created with the first word from each line as the file name and each single line as content of the resulting files and also the resulting files have the data in column format and not in a line

Thanks a billion for your help


----------



## foxidrive (Oct 20, 2012)

This uses a second batch file called batch2.bat (see below).
It replaces every comma with a newline sequence

*) save the top batch file
*) save the second batch file and call it batch2.bat
*) put them both in the same folder with file.csv
*) run the first batch file.

See how it goes.


```
@echo off
for /f "delims=" %%a in (file.csv) do (
for /f "delims=," %%b in ("%%a") do (
> "a.tmp" echo %%a
call batch2.bat
del "a.tmp"
ren "b.tmp" "%%b.csv"
)
)
```
batch2.bat


```
@set @JScript=1/*
@echo off
setlocal
cscript //nologo //E:JScript "%~f0"
goto :eof
*/
// JScript
//
var ForReading= 1
var ForWriting = 2
var fso = new ActiveXObject("Scripting.FileSystemObject");
// paths must have doubled backslashes if used
// var input = fso.OpenTextFile("c:\\temp\\filein.txt", ForReading)
// var output = fso.OpenTextFile("c:\\temp\\fileout.txt", ForWriting, true)
var input = fso.OpenTextFile("a.tmp", ForReading)
var output = fso.OpenTextFile("b.tmp", ForWriting, true)
var data = input.Readall()
data = data.replace(/\x2c/gi, "\x0D\x0A")
output.Write(data)
input.Close()
output.Close()
```


----------



## Squashman (Apr 4, 2003)

You could in theory do this all with a pure batch sub-routine by calling the sub-routine with a call statement. The commas will work as the delimiter to the sub routine and you can reference them as %1 etc. You could use the shift statement to continue processing each argument passed to the sub routine and you can keep checking if %1 is not nul to keep processing the input.


----------



## foxidrive (Oct 20, 2012)

Yes, Squashman, that's feasible. It does depend on the file content however because poison characters have to have special handling.

The method I used is impervious to any poison characters, but 
a possible flaw in my method is that for an entry like this: 
one,two,"three pears, and four boxes",five
it will also split after the 'three pears' where it is one double quoted entry.


----------



## amstro (Feb 7, 2013)

Foxidrive, that is awesome! This worked exactly like I wanted it to. I am soo so grateful for your help. Considering the amount of work that I would have to do without these batch files, I would have stayed in my lab straight for a whole month. Thanks a million. I do have one more task at hand. But I will give it a shot based on your batch files and report back. Thanks again for all your help.


----------



## amstro (Feb 7, 2013)

Foxidrive, I really need your help again. I have this file which is just one big long statement. Imagine a text file about 3 MB with all kinds of text and special characters and everything (it almost take 20 to 30 secs for notepad to open it). The guy never hit the enter key and so there is just one big line. I have to separate this long statement into several statements. A statement starts at the first character in the file and ends at a semicolon. The next line starts after the semicolon and ends at the next semi-colon, and then the next line starts and ends at the next semicolon and so on and on and finally the last line ends at the last semicolon. I tried using your 2 batch files but I get several different files 
I just need one result file which has all the content from the original file but there should be several statements as opposed to the original one statement. Also note, that it is a .txt file
I hope you can help me out again and I promise not to bother you again (for a whole month). Thank you so much.


----------



## foxidrive (Oct 20, 2012)

Try this:

call it split.bat and launch it like this: split "filein.txt" "fileout.txt"


```
@echo off
 >_.vbs echo set regex=new regexp
>>_.vbs echo regex.global=true
>>_.vbs echo regEx.IgnoreCase=True            
>>_.vbs echo regex.pattern=";"
>>_.vbs echo wscript.stdOut.write regex.replace(wscript.stdin.readall,";"+vbCRLF)
cscript /nologo _.vbs <"%~1" >"%~2"
del _.vbs
```


----------



## amstro (Feb 7, 2013)

Amazing! Worked perfectly again! Thank you soo much Foxidrive. U rock!  I wont trouble you for a whole month now


----------

