# Solved: CR, LF, newline, ASCII 13 and 10, etc.



## andynic (May 25, 2007)

This thread is about how newlines are handled on different operating systems and how that can affect a webpage.

My questions are these:
1. Both Mac OS X and Windows XP seem to interpret the ENTER key as ASCII 13 followed by 10. Is this correct?
2. Is this true for newer versions of Windows?
3. As I understand it, the "common" ways OSes interpret the ENTER key are
either ASCII 13 followed by 10,
or ASCII 13 alone,
or ASCII 10 alone.
Is this correct? Are these the only ones, or are there other interpretations/combinations "out there"?
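
One way to check which of these sequences actually shows up in a given string is to count them. This is a minimal sketch, assuming PHP 7 or later; the function name `countNewlineStyles` and the sample string are made up for illustration:

```php
<?php
// Hypothetical helper: count each newline style found in a string.
function countNewlineStyles(string $text): array
{
    return [
        'CRLF' => preg_match_all('/\r\n/', $text),      // ASCII 13 followed by 10
        'CR'   => preg_match_all('/\r(?!\n)/', $text),  // ASCII 13 alone
        'LF'   => preg_match_all('/(?<!\r)\n/', $text), // ASCII 10 alone
    ];
}

$sample = "line one\r\nline two\rline three\nline four";
print_r(countNewlineStyles($sample));
```

Running this on the raw `$_POST` value before it is stored would show which combination the browser really sent.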

The reason for (context of) the questions:

A browser-based application uses a PHP script to present a website page by selecting data from 
a MySQL database.

The webpage that maintains the database allows the webmaster to enter and change data for the web site.

The DB maintenance webpage uses a `<textarea>` to capture the data that gets stored in the 
DB column NEWS_ITEM, datatype text.

If the webmaster, for example, enters the following in the DB maintenance page:
THIS IS A NEW NEWS ITEM
BLAH BLAH BLAH
THIS IS THE END OF THE NEWS ITEM.

then the data that gets stored in column NEWS_ITEM contains ASCII 13 followed by ASCII 10 at the end of 
each line. It is the data taken from the `<textarea>` after it has been passed through 
the PHP function htmlentities.
For example:
$dataFromTextarea = $_POST['nameOfTextarea'];
$dataFromTextarea = htmlentities($dataFromTextarea, ENT_COMPAT);
Then $dataFromTextarea is inserted into column NEWS_ITEM.
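
A small sketch of what that escaping step does to the line endings. The sample string stands in for `$_POST['nameOfTextarea']` and is an assumption; as far as I know, browsers normalise textarea line breaks to CR LF when a form is submitted, which would explain seeing 13 followed by 10 on both Mac OS X and Windows:

```php
<?php
// Stand-in for $_POST['nameOfTextarea'] (hypothetical sample data).
$dataFromTextarea = "THIS IS A <NEW> NEWS ITEM\r\nBLAH & BLAH";

// htmlentities() escapes markup characters but leaves CR and LF untouched.
$escaped = htmlentities($dataFromTextarea, ENT_COMPAT);

echo $escaped, "\n";

// Confirm the CR LF pair survived the escaping step.
var_dump(strpos($escaped, "\r\n") !== false);
```

So the newline bytes reach the database unchanged, and any conversion to HTML line breaks has to happen in a separate step.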

The PHP script that presents the website selects the data from the DB.
After the script selects NEWS_ITEM, it converts all ASCII 13s followed by 10 to 
`<br />` tags:
$NEWS_ITEM = str_replace(chr(13) . chr(10), "<br />", $NEWS_ITEM);
$NEWS_ITEM = str_replace(chr(13), "<br />", $NEWS_ITEM);
$NEWS_ITEM = str_replace(chr(10), "<br />", $NEWS_ITEM);
and then presents it in a `<p>` tag: `<p>$NEWS_ITEM</p>`.
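
The conversion step can also be written more compactly. This is only a sketch with made-up sample data: `str_replace()` accepts an array of search strings and applies them in order, and PHP's built-in `nl2br()` does a similar job while keeping the original newline characters:

```php
<?php
// Hypothetical sample with all three newline styles mixed together.
$NEWS_ITEM = "THIS IS A NEW NEWS ITEM\r\nBLAH BLAH BLAH\rOLD MAC LINE\nUNIX LINE";

// One call; CR LF is listed first so it is not counted as two breaks.
$withBreaks = str_replace(["\r\n", "\r", "\n"], "<br />", $NEWS_ITEM);

// Built-in alternative: nl2br() inserts <br /> before each line break
// and leaves the newline characters themselves in place.
$withBreaksAndNewlines = nl2br($NEWS_ITEM);

echo $withBreaks, "\n";
```

The order of the search strings matters: if the lone CR or LF were replaced first, each CR LF pair would produce two breaks.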

In the development and testing environments, this seems to work as one would wish.

I am concerned about website viewers using OSes other than Mac OS X or Windows XP and browsers other
than Firefox.

Thanks for your help.
Andynic


----------



## Ent (Apr 11, 2009)

The only real way to know this for sure is to test it.
Many of the Linux or Unix OSes are free, and you can get a free virtual machine host (I use VirtualBox) to install them in, which avoids the need to multiboot. Insofar as you're worried, get a selection of browsers and operating systems and see whether they work.
It certainly looks to me like that approach would do fine.

The only thing that would be a little annoying is that the code would be much less readable. I'd certainly prefer HTML that has both `<br />` and a newline character, so that it displays as


```
Good morning World<br />
I hope that you're feeling OK<br />
If something's the matter, do let me know.<br />
Though I can't promise to solve all the problems in the world.<br />
```
Instead of

```
Good morning World<br />I hope that you're feeling OK<br />If something's the matter, do let me know.<br />Though I can't promise to solve all the problems in the world.<br />
```


----------



## andynic (May 25, 2007)

Hi Ent,
Thanks for the reply. I will do as you suggest -- both re. OSes and `<br />` followed by \n (which I normally do!) for the same reason you give.
Andynic.


----------



## allnodcoms (Jun 30, 2007)

Andy,

HTML is designed as a cross-platform markup language, and `<br />` is interpreted by the browser based on the platform. The line feed (in Josiah's example) is treated as whitespace and ignored anyway.

AFAIK, the only time that the CR / LF issue is relevant is when sending mail via NIX / Windoze servers. For straightforward text on a page you can trust the browser to do the right thing...

Danny


----------



## andynic (May 25, 2007)

Hi Danny,

I think we might be speaking at cross-purposes.

I interpret what Josiah is saying about \n to have to do with "pretty printing" the resulting HTML file that the PHP script generates when it puts up the website, i.e. what you would see as the result of clicking "view source" in the webpage.

Including the \n does indeed affect "pretty printing", which can aid debugging.

Instead of something like this in the HTML (without the \n):
`THIS IS A NEWS ITEM<br />BLAH BLAH BLAH<br />END OF NEWS ITEM`
one sees
`THIS IS A NEWS ITEM<br />`
`BLAH BLAH BLAH<br />`
`END OF NEWS ITEM`

What the website viewer sees in his/her browser in either case is:
THIS IS A NEWS ITEM
BLAH BLAH BLAH
END OF NEWS ITEM
Andynic


----------



## andynic (May 25, 2007)

Based on this information which I found at http://en.wikipedia.org/wiki/Newline

"The Unicode standard defines a large number of characters that conforming applications should recognize as line terminators:[3]
LF: Line Feed, U+000A
VT: Vertical Tab, U+000B
FF: Form Feed, U+000C
CR: Carriage Return, U+000D
CR+LF: CR (U+000D) followed by LF (U+000A)
NEL: Next Line, U+0085
LS: Line Separator, U+2028
PS: Paragraph Separator, U+2029"

I now use the following function on a string after retrieving it from the DB and before putting it up in a webpage via a php script:
function cvtNewlineToBr($frmFct, &$str)
{
    $fctNm = "cvtNewlineToBr";

    $str = str_replace(chr(13) . chr(10), "<br />\n", $str); // CR LF
    $str = str_replace(chr(10) . chr(13), "<br />\n", $str); // LF CR
    $str = str_replace(chr(13), "<br />\n", $str); // CR
    // $str = str_replace(chr(10), "<br />\n", $str); // LF Not used because \n is chr(10).
    $str = str_replace(chr(11), "<br />\n", $str); // VT
    $str = str_replace(chr(12), "<br />\n", $str); // FF
} // End cvtNewlineToBr

e.g.
$str = "some long string containing 0 to an unspecified number of newlines";
cvtNewlineToBr("calling function name", $str);
echo "<p>$str</p>\n";

Andynic
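
For comparison, a regex-based sketch of the same idea. It is not the function from this post: the name `newlinesToBr` is made up, and it relies on PCRE's `\R` escape, which matches CR LF as a single unit as well as a lone CR, LF, VT or FF. With the `/u` modifier on UTF-8 input it should also cover NEL, LS and PS from the list above, though that is worth testing on your own PHP build. Unlike the function above, it treats LF CR as two breaks, and it does convert a lone LF:

```php
<?php
// Hypothetical alternative to a chain of str_replace() calls.
function newlinesToBr(string $str): string
{
    // \R matches any single newline sequence; CR LF counts as one.
    $result = preg_replace('/\R/u', "<br />\n", $str);

    // preg_replace() returns null on error (e.g. invalid UTF-8 with /u),
    // so fall back to the untouched input in that case.
    return $result ?? $str;
}

echo newlinesToBr("first\r\nsecond\nthird\rfourth\x0Bfifth\x0Csixth");
```

Because every match is replaced in a single pass, the `\n` written into the output is never re-scanned, which sidesteps the reason the LF line was commented out.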


----------



## allnodcoms (Jun 30, 2007)

Glad it's sorted Andy, and Josiah is spot on (as usual), the newline does tidy up the source, and it doesn't affect the final render of the page. Only difference is file size - but what's a couple of bytes between friends?

Danny


----------



## Ent (Apr 11, 2009)

andynic said:


> // $str = str_replace(chr(10), "<br />\n", $str); // LF Not used because \n is chr(10).


I don't think that should cause problems at all, but if you were to find some old system that only uses \n a possible solution would be the following:

Replace each of the possible newline characters with `<br />` first.
Replace `<br />` with `<br />\n` at the end.
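
The two steps above can be sketched roughly like this. The function name is made up, and the set of characters handled in the first pass is an assumption based on the ones discussed earlier in the thread:

```php
<?php
// Sketch of the two-pass idea (hypothetical function name).
function newlinesToBrTwoPass(string $str): string
{
    // Pass 1: turn every newline variant into a bare <br />.
    // Two-character sequences come first so each counts as one break.
    $str = str_replace(["\r\n", "\n\r", "\r", "\n", "\x0B", "\x0C"], "<br />", $str);

    // Pass 2: no raw newlines remain, so add one \n after each <br />.
    return str_replace("<br />", "<br />\n", $str);
}

echo newlinesToBrTwoPass("first\r\nsecond\nthird\rfourth");
```

Since the `\n` is only added after all the raw newlines are gone, a lone LF can be converted safely as well.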

Also, for reasons of efficiency, it might be better to do this manipulation during data entry instead of every time the page is loaded, i.e. store all the correct `<br />` things in the database.


----------

