# PHP paragraph function



## dudeking (Feb 7, 2007)

Hey, I have made a very simple CMS for a client so he can update text on the site. The only thing he needs to be able to add is paragraphs, so I have written this script that takes its input from a text area.


```
function addParagraphs($string)
{
	//Remove HTML
	$string = strip_tags($string);

	//Remove Key Words
	$string = str_replace("{Line_Break}","[Line_Break]",$string);

	//Strip Line Breaks
	$string = str_replace("\r\n","\n",$string);
	$string = str_replace("\r","\n",$string);
	$string = str_replace("\n\n","\n",$string);

	//Remove Multiple Spaces
	$string = str_replace("  "," ",$string);

	//Add Tag Where break is Needed
	$string = str_replace("\n","{Line_Break}",$string);

	//Replace Tag With HTML
	$string = str_replace("{Line_Break}","</p><p>",$string);
	$string = '<p>'.$string.'</p>';

	//Return Formatted String
	return $string;
}
```
It works in my closed testing, but is there anything that could crash it or create a vulnerability? The data from here goes into a MySQL database.

Thanks for any advice


----------



## tomdkat (May 6, 2006)

Maybe try throwing random garbage at it and see what happens. Also, try embedding HTML tags in the text being passed to the function to make sure that's cool.

What if someone entered text that contained an HTML link or reference to another site which tried to load something when referenced? That sort of thing. 

Peace...


----------



## dudeking (Feb 7, 2007)

Okay, after lots of testing I found that in some cases empty paragraphs were created, e.g. `<p></p>`.

So I have added a bit to remove that, and after your comment about HTML I have added support for bold text, but I don't want links to be added into the text. On testing, when an <a> tag is entered just the actual text remains; the tag and the link both go.

New script...


```
function addParagraphs($string)
{
	//Remove HTML
	$string = strip_tags($string);

	//Remove Key Words
	$string = str_replace("{Line_Break}","[Line_Break]",$string);

	//Strip Line Breaks
	$string = str_replace("\r\n","\n",$string);
	$string = str_replace("\r","\n",$string);
	$string = str_replace("\n\n","\n",$string);

	//Remove Multiple Spaces
	$string = str_replace("  "," ",$string);

	//Add Tag Where break is Needed
	$string = str_replace("\n","{Line_Break}",$string);

	//Replace Tag With HTML - Line Break
	$string = str_replace("{Line_Break}","</p><p>",$string);
	$string = '<p>'.$string.'</p>';

	//Replace Tag With HTML - Bold
	$string = str_replace("{bold}",'<b>',$string);
	$string = str_replace("{/bold}",'</b>',$string);

	//Remove Useless Formatting
	$string = str_replace('<b></b>','',$string);
	$string = str_replace('<p></p>','',$string);

	//Return Formatted String
	return $string;
}
```
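
For comparison, here is a minimal sketch of another way to get the same result. It is not the script above: it escapes typed HTML with `htmlspecialchars()` instead of stripping it, and it splits on line breaks so that blank lines never turn into empty paragraphs. The function name `textToParagraphs()` is made up for illustration; the `{bold}` and `{/bold}` markers are the ones from the script above.

```php
<?php
// Sketch only: textToParagraphs() is a hypothetical name, not the poster's function.
function textToParagraphs(string $text): string
{
    // Normalise Windows and old Mac line endings to "\n".
    $text = str_replace(["\r\n", "\r"], "\n", $text);

    $paragraphs = [];
    // Treat any run of newlines as a single paragraph break.
    foreach (preg_split('/\n+/', $text) as $line) {
        // Collapse runs of spaces and tabs, then trim both ends.
        $line = trim(preg_replace('/[ \t]+/', ' ', $line));
        if ($line === '') {
            continue; // skip blank lines so no empty paragraph is produced
        }
        // Escape everything, so typed HTML is displayed as text.
        $line = htmlspecialchars($line, ENT_QUOTES, 'UTF-8');
        // Turn only the two bold markers into real tags.
        $line = str_replace(['{bold}', '{/bold}'], ['<b>', '</b>'], $line);
        $paragraphs[] = '<p>' . $line . '</p>';
    }
    return implode("\n", $paragraphs);
}
```

Because every run of line breaks is handled in one pass, three or more blank lines in a row behave the same as one. Whether escaping or stripping is the better fit depends on whether the client should ever see the angle brackets he typed.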


----------



## tomdkat (May 6, 2006)

What if they enter a paragraph tag in the text they enter, like this:


> This is a test of creating
> 
> a paragraph within a paragraph


and of course, the permutations:


> This is a test of creating a
> 
> paragraph with a paragraph.
> This is a test of creating a paragraph
> ...


Also, will you allow other HTML tags?


> Check out my swell image!


Or JavaScript:


> This is some text
> 
> to hack the site!
> All your base are belong to us!


And so on. 

Peace...


----------



## dudeking (Feb 7, 2007)

Beautiful! Just tested it with all your examples and it just removes the HTML, leaving the paragraphing as manually formatted, and adds no images etc...

Thanks


----------



## tomdkat (May 6, 2006)

Excellent! I suggest giving others some time to think of things we've missed so far.

One thing that comes to my mind is embedding PHP code, like this:


```
This is a <p>paragraph <?php call some function to do something obscure ?></p>
This is another <b>paragraph
<?php call some function to do something obscure ?>
that should be in bold
</b>
</p>
```
You get the idea. 

Peace...


----------



## dudeking (Feb 7, 2007)

Arrr I hadn't thought of that....I'll give it a go now.
Thanks


----------



## tomdkat (May 6, 2006)

I'm just trying to think "outside the box". 

I know now how bulletproof you'll need that to be but I would also look at maybe filtering attempts to embed objects as well with the "" tag, including the "" tag.

EDIT: The PHP example might be a hole if someone can figure out some kind of SQL injection exploit using existing database connections, or something else I don't know how to do.

Peace...


----------



## dudeking (Feb 7, 2007)

I can't actually find anything strip_tags doesn't remove. Are there any known cases that can stop it working, or any tags that are allowed through?
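
One documented case worth knowing about: when `strip_tags()` is given a list of allowed tags, it leaves the attributes of those tags alone. The script in this thread calls it without an allow-list and adds its own bold tags afterwards, so this should not apply to it as posted. The snippet below is only a small illustration with made-up input strings.

```php
<?php
// Plain call: the tags are removed and the text between them stays.
$plain = strip_tags('<a href="http://example.com/">a link</a> and <b>bold</b>');

// With an allow-list, the allowed tag keeps ALL of its attributes,
// so an event handler such as onmouseover survives untouched.
$risky = strip_tags('<b onmouseover="alert(1)">bold</b>', '<b>');

// Escaping at output time is a separate step from stripping at input time.
$escaped = htmlspecialchars($risky, ENT_QUOTES, 'UTF-8');
```

Here `$plain` ends up as plain text, while `$risky` still carries the `onmouseover` attribute until it is escaped.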


----------



## tomdkat (May 6, 2006)

I don't know. So, I'm thinking up as much random stuff as I can to make sure something unwanted doesn't creep through. 

Peace...


----------



## dudeking (Feb 7, 2007)

My site's pretty secure against SQL injection. I've set the server up to add \ to all quotes in $_POST data. So unless they can edit my .ini file I don't think that's too much of an issue. But I'm sure there are people out there who can.
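
The setting that adds backslashes to quotes in `$_POST` is `magic_quotes_gpc`, which was deprecated in PHP 5.3 and removed in PHP 5.4, so it is not something to rely on in the long run. A commonly recommended alternative is a prepared statement, sketched below with PDO. The function name `savePageText()` and the `page_text` table with its `body` column are assumptions made up for this example, not part of the CMS described here.

```php
<?php
// Sketch only: savePageText() and the page_text table are hypothetical.
function savePageText(PDO $pdo, string $body): void
{
    // The ? placeholder keeps the data separate from the SQL text,
    // so quotes inside $body need no manual escaping.
    $stmt = $pdo->prepare('INSERT INTO page_text (body) VALUES (?)');
    $stmt->execute([$body]);
}
```

It would be called with an existing connection, for example one opened with a `mysql:` DSN. The text is then stored exactly as typed, without added backslashes.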


----------



## colinsp (Sep 5, 2007)

Not sure about the vulnerabilities, but what about giving them a WYSIWYG editor? That may help, as there is a lot of code checking built in.

A couple that spring to mind are TinyMCE and OpenWYSIWYG.


----------



## dudeking (Feb 7, 2007)

I have tried that, but most of them lack support for Safari. And it's kind of surplus to requirements... It really just needs to be plain text, then it's styled to fit in with the rest of the site. The only functionality I needed was paragraphs (which needed to be detected and have `<p>` tags placed in) and a bold tag for if people want to be really direct.

I'm just concerned about having this box letting things be added to the database. How does PHP handle weird ASCII codes, like these hearts and things people put on social networks? Can I force UTF-8?


----------



## tomdkat (May 6, 2006)

dudeking said:


> My sites pretty secure against SQL injection. I've the server setup to add \ to all quotes in $_POST data. So unless they can edit my .ini file I don't think thats too much of an issue. But im sure theres people out there who can.


Remember to think _outside_ the box. As soon as you open up the database to receiving input from a user, you're opening yourself up for who knows what to be thrown at the database. 



dudeking said:


> I'm just concerned about having this box letting things be added to the database. How does PHP handle weird ASCII codes.. like these hearts and things people put on social networks. Can I force UTF8?


Great question! I know some malicious JavaScript is obfuscated, and I wonder if a malicious PHP exploit could be obfuscated in a similar fashion and include binary data that would be interpreted as ASCII control codes (the symbols you mention). Since you're using an HTML form you might be able to control the character encoding through one of the attributes of the tag.

Additionally, we need to consider the exposure of the input fields. Will random people be submitting text or will one or two specific people you know be the only ones?

Peace...


----------



## dudeking (Feb 7, 2007)

In this instance purely the client and any members of staff will have access. But I do need a commenting system for a band's website I'm working on, so eventually it will be used for that too.

Thing is, encoding that's controlled from the tag can easily be changed. I've just done some googling and there's a function called utf8_encode(). So I'm guessing that will make sure everything is saved in the database correctly.
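
A caveat on that guess: `utf8_encode()` converts a string from ISO-8859-1 to UTF-8. It does not check anything, it double-encodes text that is already UTF-8, and it is deprecated as of PHP 8.2. If the goal is to reject input that is not valid UTF-8, one option is to test it first. The sketch below uses the PCRE `u` modifier for that; the helper name `isValidUtf8()` is invented for illustration.

```php
<?php
// Sketch only: isValidUtf8() is a hypothetical helper name.
function isValidUtf8(string $text): bool
{
    // With the /u modifier PCRE refuses malformed UTF-8: preg_match()
    // returns false for bad input and 1 for valid input, because the
    // empty pattern matches any well-formed string.
    return preg_match('//u', $text) === 1;
}
```

The database side matters as well: the connection and the column would also need a UTF-8 character set for hearts and similar symbols to be stored intact.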


----------



## tomdkat (May 6, 2006)

Sounds like a plan!

I've provided all the input I can think of so I'll bow out at this point. 

Good luck!

Peace...


----------



## dudeking (Feb 7, 2007)

Thank you very much for your help.


----------



## tomdkat (May 6, 2006)

I actually thought of something else. I'm an idiot for not thinking of this sooner. 

PHP has a limit on the amount of data it can receive from an HTML form using the "POST" method. Read up on the "post_max_size" PHP setting. With this in mind, you should test putting LARGE amounts of data in the form field where the user will be editing the text. Load word processing documents and copy/paste the contents into the HTML form field and see what happens. Try pasting tens of kilobytes and maybe get into the megabytes range to see how PHP behaves, how MySQL behaves, and how your CMS behaves.
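
A rough sketch of two checks that go with this kind of testing. According to the PHP manual, when a request body is larger than `post_max_size` the `$_POST` and `$_FILES` arrays arrive empty, so the script sees no field at all rather than a truncated one. The helper names and the 65,535-byte cap are assumptions for illustration; the cap matches what a MySQL `TEXT` column can hold, assuming that is the column type in use.

```php
<?php
// Sketch only: the names and the size cap below are illustrative assumptions.
const MAX_TEXT_BYTES = 65535; // capacity of a MySQL TEXT column

function isAcceptableLength(string $text, int $maxBytes = MAX_TEXT_BYTES): bool
{
    // strlen() counts bytes, which is what storage limits are measured in.
    return strlen($text) <= $maxBytes;
}

function postLooksDiscarded(array $server, array $post): bool
{
    // The client declared a body, yet no form fields arrived:
    // a typical sign that post_max_size was exceeded.
    $declared = (int) ($server['CONTENT_LENGTH'] ?? 0);
    return $declared > 0 && count($post) === 0;
}
```

In a form handler these would be called as `postLooksDiscarded($_SERVER, $_POST)` first, then `isAcceptableLength()` on the submitted text before it goes anywhere near the database.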

Peace...


----------



## dudeking (Feb 7, 2007)

Oh, I hadn't even realised that was an issue. I would hope that the script would just crash, but I will have to check it out. Thinking of that, the script is obviously expecting a string. What if the HTML form was edited and a file was posted to the server where a string was expected? A user can easily change HTML form elements. They could even change it to a check box or something, couldn't they? What would happen?
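
For what it's worth, a checkbox still arrives as a string (or not at all), and uploaded files land in `$_FILES` rather than `$_POST`. The case that does change the type is a field renamed to something like `text[]`, which PHP turns into an array. A small guard such as the sketch below covers both the missing and the array case; `readTextField()` is a made-up helper name.

```php
<?php
// Sketch only: readTextField() is a hypothetical helper name.
function readTextField(array $post, string $name): ?string
{
    // A missing field, or one sent as name[] (which PHP turns into an
    // array), is rejected instead of being passed to string functions.
    if (!isset($post[$name]) || !is_string($post[$name])) {
        return null;
    }
    return $post[$name];
}
```

Calling `readTextField($_POST, 'text')` and bailing out on `null` means the paragraph function only ever receives a real string.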


----------



## tomdkat (May 6, 2006)

The memory issue came to mind because I remembered that PHP has memory restrictions (albeit configurable) on the scripts it runs, and I then vaguely remembered sometimes having issues with trying to submit "too much" data in an HTML form processed by a PHP script.

In any event, a crash might happen if too much data is submitted via your form but that's also what hackers sometimes rely upon. Send garbage data to cause an overflow, then put special instructions in the area now exposed due to the overflow.

Also, you'll need to know how MySQL will deal with this kind of thing. If the "blob" makes it to PHP land ok, will the data be sound for insertion into the database?

What if the HTML file wasn't edited but the URL of the form processing script was loaded directly, with the binary (or other) data sent along as payload? I don't know what would happen either, but in the interest of testing your code, it might be worth checking out.

Peace...


----------

