Call me a tease, but I’ve finally gotten around to posting some sample pages for my server statistics reporting tool, Stats++. No downloads yet. I still have a few bugs to work out and a lot of polishing, documentation, etc. to do, but I’m making good progress.
Here are some sample screens I saved from the development server:
(Links open in new window)
The Dashboard
A traffic graph along with the Top 10 Lists:
- Busiest Pages, sortable by any column — traffic summary of hits, pages, new visitors, visitors, bandwidth, robots, etc.
- New Visitor Entry Pages — the pages which brought in the most new visitors
- Top Referers — The pages which referred the most traffic to pages on the site. Detailed information about which pages were referred can be found in the Referer Report.
Daily Traffic Report
A traffic report summarizing the traffic for, in this case, the month to date. The timespan is fully controllable using the form up in the navigation bar.
Page Report
A traffic report for a single page.
This page provides much the same information as the Daily Traffic Report, but also lists all of the referers which have linked to the page. It also allows filtering to a specific query string and to arbitrary date ranges.
Referer Traffic Report
Similar to the Page Report, but this one reports on traffic referred from a page, along with a summary of pages to which it links.
I also generate Operating System and Browser statistics, but left those off for the time being.
Stats++ is a database-driven, web-based traffic reporting application, written using PHP4, ADO, JPGraph (free for non-commercial use), PostgreSQL and Perl. Features include:
- Data-driven reporting of Web server traffic statistics
- Multiple server support
- Input filtering
- Session tracking
- Detailed referer tracking
- Smart-ish (IMHO) differentiation of “pages” versus “hits”
- Real-time data loading
Database performance remains solid with about 40,000 rows of request data and 2,200 hosts. Everything is still rendering in under 2 seconds. This is in a CoLinux virtual machine on my laptop. I don’t know how long it will continue to scale before the aggregates get out of hand even with index scans.
I expect I will need to refactor those figures out of the fact tables, essentially pre-calculating the aggregates from the request data, but I would prefer to get a release out the door before I tackle such a major rework of the back-end. I went down this data design route because it gave me a lot more flexibility, but I will pay the price later. For now I’m just ignoring the problem. I’m also delaying the rework because it will require some serious thinking about where I’m headed with regard to my choice of database.
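To make the pre-calculation idea concrete, here is a minimal sketch of rolling raw request rows up into a per-day, per-page summary table that reports could read instead of aggregating the fact table on every page view. It is an illustration only, not the Stats++ schema: the `requests` and `daily_page_summary` tables, their columns, and the sample rows are all made up, and SQLite (via Python) stands in for PostgreSQL so the example runs on its own.

```python
import sqlite3


def build_daily_summary(conn):
    """Rebuild a summary table with one row per (day, page).

    Hypothetical schema: `requests` is the raw fact table and
    `daily_page_summary` holds the pre-calculated aggregates.
    """
    conn.execute("DROP TABLE IF EXISTS daily_page_summary")
    conn.execute(
        """
        CREATE TABLE daily_page_summary AS
        SELECT date(requested_at)   AS day,
               page,
               COUNT(*)             AS hits,
               COUNT(DISTINCT host) AS visitors,
               SUM(bytes)           AS bandwidth
        FROM requests
        GROUP BY date(requested_at), page
        """
    )
    conn.commit()


# In-memory database with a handful of made-up request rows.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE requests ("
    "requested_at TEXT, host TEXT, page TEXT, bytes INTEGER)"
)
conn.executemany(
    "INSERT INTO requests VALUES (?, ?, ?, ?)",
    [
        ("2005-03-01 10:00:00", "10.0.0.1", "/index.html", 1200),
        ("2005-03-01 10:05:00", "10.0.0.2", "/index.html", 1200),
        ("2005-03-01 11:00:00", "10.0.0.1", "/about.html", 800),
        ("2005-03-02 09:00:00", "10.0.0.1", "/index.html", 1200),
    ],
)

build_daily_summary(conn)

# Reports now read the small summary table rather than the raw rows.
summary = conn.execute(
    "SELECT day, page, hits, visitors, bandwidth "
    "FROM daily_page_summary ORDER BY day, page"
).fetchall()
for row in summary:
    print(row)
```

The trade-off is the one described above: the summary is fast to query but fixes the grouping in advance, so any report that slices the data differently (by query string, say) either needs its own summary table or has to fall back to the raw rows.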
I’m on PostgreSQL at the moment, and while it would be better suited to the sort of back-end I’m envisioning long-term, I know that I’m in the minority with it as compared to MySQL. I don’t really see myself supporting multiple databases long-term if doing so takes up a major chunk of the codebase. It’s one thing to provide documentation for multiple schemas. It’s another thing entirely to maintain multiple codebases of stored procedures.
For now, though, that’s another problem for another day.