Saturday, December 18, 2010

Figuring Out Apache's Config File Structure

Figuring out the layout of Apache config files has always been confusing to me. I never quite know where I should put a given directive. Do I put it in the httpd.conf file? Or elsewhere? What about site or web-app specific config--stuff I can't or don't want to put in .htaccess? On my PC there appear to be about half a dozen little directories for putting different kinds of Apache configuration. Is this standard? What's the standard best practice for configuring Apache? Where do I put my stuff?

Debian & Ubuntu's Layout

Apparently there's not a whole lot that *is* standard. We can, though, learn a lot by looking at what Debian and Ubuntu do. This great article lists how Debian and Ubuntu's flavors of Apache work. According to the article, Debian's default Apache config file is /etc/apache2/apache2.conf. From this file, other files are pulled in with the Include directive in the following order:

# /etc/apache2/apache2.conf - pulls in additional
# configurations in this order:
Include /etc/apache2/mods-enabled/*.load
Include /etc/apache2/mods-enabled/*.conf
Include /etc/apache2/httpd.conf
Include /etc/apache2/ports.conf
Include /etc/apache2/conf.d/[^.#]*
Include /etc/apache2/sites-enabled/[^.#]*

Debian divides configs between those for Apache modules (the mods-enabled directory), listen ports (ports.conf), VirtualHosts (the sites-enabled directory), and random other things (the conf.d directory). The file httpd.conf is included for legacy reasons: it's the most common name for Apache's root config file on other distros.

Each file in sites-enabled contains a VirtualHost directive. This directive allows configuration to be scoped to the host and/or port number used to access the site. ports.conf holds both NameVirtualHost and Listen directives. Listen indicates what ports Apache should listen on. NameVirtualHost declares the address and port on which the name-based VirtualHosts defined later should be expected. In effect it says there will be a virtual host at, say, localhost port 80. This is kind of like a "forward declaration" in C++ speak: naming something without giving an explicit, full-fledged definition of the configuration you want to apply to it.
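Here is a minimal sketch of how the two pieces fit together, in Apache 2.2-era syntax. The file name under sites-enabled, the ServerName, and the DocumentRoot are all made up for illustration; they are not taken from the article or from my box.

```apache
# /etc/apache2/ports.conf (sketch)
# Announce that name-based virtual hosts will live on port 80 of any
# address, and actually listen on that port.
NameVirtualHost *:80
Listen 80

# /etc/apache2/sites-enabled/example (hypothetical file name)
# The "full definition" that matches the declaration above.
<VirtualHost *:80>
    ServerName example.localhost
    DocumentRoot /var/www/example
</VirtualHost>
```

The address in the VirtualHost tag has to match the one given to NameVirtualHost, otherwise the two don't line up. Note that in Apache 2.4 NameVirtualHost is no longer needed, so this pairing applies to the 2.2 series.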

mods-enabled holds a .load file for each module containing a LoadModule directive, which tells Apache how to load the module. Each module may also have an optional .conf file for module-specific configuration. The dav_svn module on my box, for example, sets the SVNPath directive to the path of my SVN repository in its dav_svn.conf file.
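As a rough sketch, a .load/.conf pair for dav_svn could look something like the following. The module path, the /svn URL, and the repository path are assumptions for illustration, not copied from a real install.

```apache
# mods-enabled/dav_svn.load (sketch)
# Tells Apache which shared object implements the module.
# The .so path is an assumption and varies by install.
LoadModule dav_svn_module /usr/lib/apache2/modules/mod_dav_svn.so

# mods-enabled/dav_svn.conf (sketch)
# Module-specific settings; /svn and /var/lib/svn are hypothetical.
<Location /svn>
    DAV svn
    SVNPath /var/lib/svn
</Location>
```

The split keeps "how to load it" separate from "how to configure it", so disabling the module removes both without touching anything else.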

Debian makes it easy to enable/disable modules and virtual hosts. Debian places all available modules and VirtualHost configs in the mods-available and sites-available directories respectively. These could be copied (or symlinked) into the matching *-enabled directory directly; however, Debian distros make it a little easier by providing the Debian-specific commands a2enmod/a2dismod for enabling/disabling modules and a2ensite/a2dissite for enabling/disabling VirtualHosts. These commands place the specified configs in, or remove them from, the corresponding *-enabled location. A restart of Apache picks up the new module/site config.

When installing from a package, Debian puts the config file specific to each installed web app in the conf.d directory with the other miscellaneous config files. For example, when I installed the phpmyadmin web app through Synaptic, a phpmyadmin.conf was placed in the conf.d directory. This config file uses the Alias directive to point the /phpmyadmin location at /usr/share/phpmyadmin. The file also disables/enables various PHP settings for phpmyadmin.
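A per-app file of this kind might look roughly like the sketch below. It is not the real phpmyadmin.conf; the app name, paths, and the one PHP setting shown are hypothetical, and the access-control lines use Apache 2.2 syntax.

```apache
# /etc/apache2/conf.d/myapp.conf (hypothetical)
# Map a URL path onto the directory where the app is installed.
Alias /myapp /usr/share/myapp

<Directory /usr/share/myapp>
    DirectoryIndex index.php

    # Apache 2.2-style access control: let everyone in.
    Order allow,deny
    Allow from all

    # Only applied when the PHP 5 module is loaded.
    <IfModule mod_php5.c>
        php_admin_flag allow_url_fopen Off
    </IfModule>
</Directory>
```

Because Alias is only allowed in server or virtual host context, this is exactly the kind of thing that cannot live in an .htaccess file.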

The Problem

I find Debian's layout to be a very tidy setup. Unfortunately, based on this article, each OS's install does something different. Each OS's install of Apache may use a different root configuration file. How this configuration file includes other paths/files also differs for each OS. Oh joy. So your average Apache professional must learn the *right* place to put stuff for each distro? How does someone maintain a clean setup across multiple OSes without going nuts?

More specifically, consider the time when I want tight control of the server--I want to completely disable .htaccess files. I want a single file with a very clearly defined, static Apache config for each installed web app (as is done with phpmyadmin). I want an obvious place to put that config on each OS when setting up the app. Furthermore, I may need some configuration that can't be put into an .htaccess file (like phpmyadmin's Alias directive). So where do I put my web app specific configuration? If I create some ostensibly platform-independent LAMP app, how could I write a script to ensure it gets installed in an OS-independent way? Is this even feasible?
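For the "completely disable .htaccess" part, a minimal sketch looks like this. Where the block goes is the whole question of this post; the directive itself is standard.

```apache
# Server-level config (sketch): ignore .htaccess files everywhere.
# With AllowOverride None, Apache does not even look for them.
<Directory />
    AllowOverride None
</Directory>
```

A more specific Directory block elsewhere in the config can still turn overrides back on for its own subtree, so it is worth checking that no included file does that.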

There is Some Hope

One thing that should be standard is that there is always one "root" server-level file. Apache's own documentation lets us feel secure in this fact:
"The main configuration file is usually called httpd.conf. The location of this file is set at compile-time, but may be overridden with the -f command line flag."
That root file may include others based on a given distribution's setup (as apache2.conf does for Debian). But a single, default, root Apache configuration file should always exist. We can hang our hat on that at least. So as I see it, the best way to place a web app specific config into Apache is to first try to find a directory for miscellaneous configurations (for Debian that's /etc/apache2/conf.d). Failing that, we'll have to do our best to directly include our configuration in the target OS's root configuration file.
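For the fallback case, the change to the root file can be as small as one line. This is only a sketch; /etc/myapp/apache.conf is a hypothetical path for wherever the app keeps its own Apache config.

```apache
# Appended to the root config file (httpd.conf on many installs).
# /etc/myapp/apache.conf is a hypothetical path.
Include /etc/myapp/apache.conf
```

Keeping the app's directives in its own file means the root config only gains that single line, which is easy to find and easy to remove again.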

It looks like the bottom line is you *can't* get around learning your distribution/OS's standard config file layout. You're gonna have to familiarize yourself with your OS's Apache install a little in order to play nice. Unfortunately the knowledge is not going to be very portable between OSes. But you can help yourself out by finding the root Apache config file, seeing what it includes, and moving forward from there.

Doug being an idiot -- An example of checking the most obvious thing first...

Early this morning I set myself onto what should be the relatively simple task of installing a LAMP development environment in my Ubuntu VirtualBox VM. I used aptitude to grab apache2, mysql, php5, etc. To test that I had everything working correctly (and because it's useful in general) I grabbed phpmyadmin. Upon browsing to http://localhost/phpmyadmin I noticed Apache was not running the index.php file; instead my browser downloaded the file. I must be forgetting to install something.

I initially suspected this problem to be related to not having the php5 Apache module installed (a separate package from the base php5 one). After installing/refreshing I still had the problem. Browsing to the IP address directly (http://<ipaddr>/phpmyadmin/) did not show the problem. phpmyadmin worked fine. Even browsing to http://127.0.0.1/phpmyadmin didn't show the problem. It all worked. Only http://localhost had the problem. WTF! I suspected a directive in the Apache config made the site available for web users but not through localhost. I was sure there was something with the name "localhost" somewhere in the config file. After coming up with nothing, I tried changing "localhost" to "localhost2" in my hosts file. Chrome still showed the same bad behavior for http://localhost, even though it shouldn't even be able to contact the web server at all now! WTF... Wait a second, I thought. I bet Chrome is caching this crap and serving the download again from its cache.

Resetting my hosts file, I loaded up Firefox, browsed to http://localhost and it worked. Clearing Chrome's cache also got this to work.

Damnit, I'm an idiot.