PHP configuration statistics

  • Written by Damien Seguy
  • Friday, 3 November 2006
This document is also available in French.

Everyone knows PHP's famous phpinfo(), which provides the programmer with invaluable information about a server's configuration and setup. It is a useful tool as soon as one gets a new server, and it also gives a common basis for talking with any administrator.

Yet, after use, it is usually recommended to remove it, or to restrict access to a few people. Indeed, phpinfo() may be dangerous in itself: in the past, it was even vulnerable to XSS injections. Even when secured, phpinfo() publishes information about your architecture, and it is always recommended to keep it away from prying eyes.

Sadly enough, the habit of setting up a phpinfo page on every web site is now so widespread that even search engines are starting to pick them up: there are literally thousands of phpinfo pages indexed on Yahoo and Google. Just run a search on Google with the words 'phpinfo()', 'GoogleBot' and "Zend Scripting Language Engine". See also Ilia's article about protecting your phpinfo: Reliably locating phpinfo.

On the bright side, collecting all those phpinfo pages on the Internet provides us with a wealth of information about the values of PHP configuration directives for a broad range of web sites around the world. Once we have the URL, we just have to download the phpinfo() page. Then we screen scrape it, store the data, and compute lots of stats. This is just what I did.
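The screen-scraping step can be sketched roughly as follows. This is only an illustration in Python, not the script actually used for this article. It assumes the usual phpinfo() HTML layout, where each row holds a name cell with class "e" followed by value cells with class "v" (local value first). The names PhpinfoParser and parse_phpinfo are hypothetical.

```python
from html.parser import HTMLParser


class PhpinfoParser(HTMLParser):
    """Collect directive -> value pairs from phpinfo() HTML output.

    Assumption: rows look like
    <tr><td class="e">name</td><td class="v">local</td><td class="v">master</td></tr>
    """

    def __init__(self):
        super().__init__()
        self.directives = {}
        self._cell_class = None  # class of the <td> being read, if any
        self._buffer = []        # text fragments of the current cell
        self._row = []           # (class, text) cells of the current row

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag == "td":
            self._cell_class = dict(attrs).get("class")
            self._buffer = []

    def handle_data(self, data):
        if self._cell_class is not None:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "td" and self._cell_class is not None:
            text = "".join(self._buffer).strip()
            self._row.append((self._cell_class, text))
            self._cell_class = None
        elif tag == "tr" and len(self._row) >= 2:
            (name_class, name), (value_class, value) = self._row[0], self._row[1]
            # Keep only "name, value" rows; the first value cell is the local value.
            if name_class == "e" and value_class == "v":
                self.directives[name] = value


def parse_phpinfo(html):
    """Return a dict of directive names to their (local) values."""
    parser = PhpinfoParser()
    parser.feed(html)
    parser.close()
    return parser.directives


sample = (
    '<table>'
    '<tr><td class="e">register_globals</td><td class="v">Off</td><td class="v">Off</td></tr>'
    '<tr><td class="e">memory_limit</td><td class="v">8M</td><td class="v">8M</td></tr>'
    '</table>'
)
settings = parse_phpinfo(sample)
```

Older or customised phpinfo pages may use different markup, so a real harvest would probably need a few fallbacks.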

I gathered 12,000 phpinfo() pages on the Internet and reduced them to 11,048 useful ones for this paper. 11,000 is a pretty low number compared to the millions of PHP web sites in the world: this is truly a drop in the ocean. Compared to the crowds of phpinfo() pages on Yahoo! and Google, though, it still represents about 1% of the total population: this is not so bad.

How much interesting information will we gather from this corpus? Will it be representative of common practices? Everyone will agree that letting a search engine index your phpinfo() is pretty dumb, even if it happens by accident. It is definitely neither safe nor a best practice. Actually, I recommend you check Google to see whether it has harvested a phpinfo from your web site: use the keywords 'phpinfo site:yoursite.com'. You may get a nasty surprise. So, just so you know, I couldn't spot any high-profile web site in the corpus we're going to use: in short, I didn't get any phpinfo from Google, Yahoo, NASA, Lockheed or the UN.

On the other hand, 11,000 is already a large population. Moreover, after collecting those phpinfo pages, I calculated the correlation between the distribution of version numbers in the phpinfo() population and the distribution in PHP Version stats; the correlation was 87%, which is pretty good. All in all, this population seems to be fairly representative.
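For readers who want to run the same kind of check, here is a minimal sketch of a correlation between two version distributions. It assumes a Pearson coefficient, which may not be the exact measure used above. The function name and the sample shares are made up for illustration.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equally long sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of the same length (at least 2)")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sum((x - mean_x) ** 2 for x in xs)
    spread_y = sum((y - mean_y) ** 2 for y in ys)
    if spread_x == 0 or spread_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return covariance / sqrt(spread_x * spread_y)


# Hypothetical shares per PHP version, in the same version order in both lists.
corpus_shares = [0.10, 0.35, 0.30, 0.25]
reference_shares = [0.12, 0.30, 0.33, 0.25]
similarity = pearson(corpus_shares, reference_shares)
```

A value close to 1 means both populations rank and weight the versions in a similar way.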

So, to finish this lengthy introduction: those stats are serious and representative, but maybe not too precise. There is interesting data to pull from them.

One last word before the pies: phpinfo holds lots of values, and it was tempting to produce even more stats and cross analyses. I'll have to publish them in several articles. If you are interested in figures that could help cast the light of Truth or give a better understanding of the situation, feel free to mention them in the comments or mail them to me: I promise I'll process all sane requests.

Also, I'm interested in extending this corpus. If you have phpinfo() pages available, feel free to mail them to me as URLs, attachments or plain text. I'll add them to the current corpus.

Technical environment

Operating System

Unsurprisingly, Linux is the most frequent host for PHP web sites. FreeBSD, Windows and Mac trail far behind. Other OSes (in declining order): AIX, OpenBSD, OSF1, NetBSD, IRIX64, HP-UX, BSD/OS, Debian, NetWare, UnixWare, IRIX, VMS, SCO_SV, AmigaOS, QNX, BeOS.

Apache is leading, as usual. IIS does not appear as such in phpinfo: it is hidden behind ISAPI, CGI or FastCGI. Apache 2's market share seems a tad high compared to Apache 1.

Compilation date is an interesting criterion for judging how trustworthy those figures are. A large majority of the servers recompiled PHP this year or last year. It seems that the community is taking care of its servers.

Here are some monthly compilation stats. Surprisingly, PHP administrators 'like' to compile PHP in August and September. They seem to take some rest the following month, as October is the month with the fewest PHP compilations.
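The monthly tally can be reproduced with a few lines of Python. The sketch below assumes that the build date string scraped from phpinfo starts with an English month abbreviation, as in "Aug 14 2006 10:32:01". The function name and the sample dates are hypothetical.

```python
from collections import Counter

MONTHS = ["Jan", "Feb", "Mar", "Apr", "May", "Jun",
          "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"]


def builds_per_month(build_dates):
    """Count build dates per month; strings with an unknown first token are skipped."""
    counts = Counter()
    for raw in build_dates:
        parts = raw.split()
        if parts and parts[0] in MONTHS:
            counts[parts[0]] += 1
    # Return the months in calendar order, including the ones with zero builds.
    return [(month, counts[month]) for month in MONTHS]


# Hypothetical sample of scraped build dates.
dates = [
    "Aug 14 2006 10:32:01",
    "Aug  3 2005 09:00:00",
    "Oct 21 2006 18:45:10",
    "unknown",
]
monthly = dict(builds_per_month(dates))
```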

Famous directives

Some PHP directives are pretty popular these days. Let's take a look at them.

Register_globals

The official recommendation is: turn it off! How well is this recommendation followed? More than half of the configurations still have it enabled. There are surely several explanations, and I'm sure you have one. Actually, this global figure hides some truth, as levels differ widely from one version to another. We'll see that later, with other figures.
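To see how a global figure can hide differences between versions, the share of servers with a directive enabled can be split by major PHP version. The sketch below assumes each scraped configuration is a dictionary with a hypothetical "version" key next to the directive values. The sample data is invented.

```python
from collections import defaultdict


def enabled_share_by_version(configs, directive):
    """Share of configurations with `directive` enabled, per major PHP version."""
    totals = defaultdict(int)
    enabled = defaultdict(int)
    for config in configs:
        version = config.get("version")
        value = config.get(directive)
        if version is None or value is None:
            continue  # skip incomplete records
        major = version.split(".")[0]
        totals[major] += 1
        if value.strip().lower() in ("on", "1"):
            enabled[major] += 1
    return {major: enabled[major] / totals[major] for major in totals}


# Hypothetical corpus of three servers.
corpus = [
    {"version": "4.4.4", "register_globals": "On"},
    {"version": "4.3.10", "register_globals": "Off"},
    {"version": "5.1.6", "register_globals": "Off"},
]
shares = enabled_share_by_version(corpus, "register_globals")
```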

Another unpopular directive: safe_mode. Surprisingly, its usage is low, which contradicts the common feeling that PHP is for shared hosts. Actually, Nexen Services (my hosting parent company) got rid of it some years ago and relies on open_basedir instead. There seems to be a similar trend among hosting companies. Of course, dedicated hosting doesn't use it. The recommendation is: don't use it.

Allow_url_fopen allows file functions and streams (fopen, file, file_get_contents, etc.) to access remote files, such as web pages or FTP files. This is also a security concern, as it opens the door to PHP code injections. Yet, 90% of web sites allow it.

Magic_quotes was a security feature, and is now just a shameful option: it will die in PHP 6. Yet, it seems that lots of PHP specialists still use it to protect their SQL.

The other magic_quotes directive, the runtime version (magic_quotes_runtime), has all but disappeared.

Max_execution_time is certainly the first directive everyone discovers. Almost no web server sets a value lower than the generous default of 30 seconds. In fact, the trend is to raise this value, sometimes by a lot. This is usually to accommodate administration scripts that require long running times.

Memory_limit is the one stat that surprised me the most. Before reading the results, I believed that everyone was using it, more or less. Actually, most of the servers do not even have it compiled in!

It also has the most scattered values. The default value of 8 MB seems to be insufficient quite often. Galleries and graphic tools require larger shares of memory. The PHP group has already acknowledged this, raising the default value to 16 MB in PHP 5.2. This may not be sufficient, as the second most common configuration is 32 MB. We'll see.
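Comparing scattered memory_limit values is easier once they are all converted to bytes. Here is a small sketch that handles PHP's shorthand suffixes K, M and G. The function name is hypothetical, and it assumes the scraped value is either a plain integer or an integer followed by one of those suffixes.

```python
def shorthand_to_bytes(value):
    """Convert a php.ini size such as '8M' or '32M' to a number of bytes.

    A plain integer (including -1, which PHP treats as "no limit") is
    returned unchanged.
    """
    text = value.strip()
    if not text:
        raise ValueError("empty size value")
    multipliers = {"k": 1024, "m": 1024 ** 2, "g": 1024 ** 3}
    suffix = text[-1].lower()
    if suffix in multipliers:
        return int(text[:-1]) * multipliers[suffix]
    return int(text)


default_limit = shorthand_to_bytes("8M")
common_limit = shorthand_to_bytes("32M")
```

With every value expressed in bytes, sorting the corpus or grouping it into ranges becomes straightforward.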

Security

Here are some security choices in terms of directives.

Open_basedir is your best bet as a security directive, yet it seems to be unknown to most PHP programmers. I didn't detail the values.

Disable_functions allows fine tuning of specific functionalities. It is usually used alongside safe_mode, and is not too popular.

Display_errors is the directive that allows errors to be displayed in the generated content. This is very handy during development but is a no-no in production.

Enable_dl allows the dynamic inclusion of functionalities in PHP. Generally speaking, this is useful for programmers, so that they do not have to restart the whole web server each time they test PHP. In production, this is bad and slow. Besides, with a short PHP script and an uploaded library, your whole server may be at risk.

Error_reporting configures the reporting level. The higher the value, the more thoroughly PHP reports errors and notices. I used to recommend setting error_reporting to 0, but it is much better to set display_errors to Off and keep error_reporting at a decent level. Values higher than 2000 are OK for development, but may be difficult to use in production.
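Since error_reporting is a bit mask, a scraped number like 2039 or 2047 is easier to read once decoded into level names. The sketch below uses the constant values documented for PHP 5 (E_ERROR = 1 up to E_RECOVERABLE_ERROR = 4096); later PHP versions add more levels. The function name is hypothetical.

```python
# Error level bits as documented for PHP 5.
ERROR_LEVELS = {
    1: "E_ERROR",
    2: "E_WARNING",
    4: "E_PARSE",
    8: "E_NOTICE",
    16: "E_CORE_ERROR",
    32: "E_CORE_WARNING",
    64: "E_COMPILE_ERROR",
    128: "E_COMPILE_WARNING",
    256: "E_USER_ERROR",
    512: "E_USER_WARNING",
    1024: "E_USER_NOTICE",
    2048: "E_STRICT",
    4096: "E_RECOVERABLE_ERROR",
}


def decode_error_reporting(value):
    """List the names of the error levels enabled in an error_reporting mask."""
    return [name for bit, name in sorted(ERROR_LEVELS.items()) if value & bit]


# 2047 enables every level from E_ERROR to E_USER_NOTICE;
# 2039 is the same mask with E_NOTICE switched off.
everything = decode_error_reporting(2047)
without_notices = decode_error_reporting(2039)
```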

The error log should be the only place that receives production errors. You can use it later to check your application's health, or to assess a disaster.

Expose_php displays the PHP version number in the HTTP headers and adds some easter eggs. Security-wise, this is not recommended, as it is always better to conceal information from attackers. On the other hand, I need those values to produce my monthly stats, so I cannot make up my mind. Thank you to all of you who keep this information on.

Other directives

File_uploads is always available!

Upload_max_filesize keeps the biggest uploaded files from cluttering the server. The default value is 2 MB, which will be OK for most galleries. It is surprising to see the number of servers using 100 MB as a value...

<? instead of <?php (short_open_tag): there are still many projects and applications that use this tag. It is recommended to use <?php, but it seems that almost every server accepts the short form.

Using PHP while believing it is ASP (asp_tags)? This is the choice of 4%...

Y2K compatibility (y2k_compliance) is OK for 75% of the web servers.

Precision is the number of significant digits used when displaying floats.

Conclusion

Configuration values hold surprises, or not. After reading those values, we may even wonder whether some functionalities really required a directive...

As usual, the default values from the distribution are the most commonly used values: this shows how much trust PHP programmers have in the PHP group. Or, it may also show that too few people read the php.ini file and understand it.

The pie charts above give a global overview of the community, but they don't show any trends. To get them, we'd need to cross those figures with versions... that will be for next time.
