Channel: Piwik Forums - Support & Bugs

Working on Apache Logs (no replies)

Hey folks!

Yes, I did read the FAQ and the manual, and I spent about an hour googling around. So far I have only come across outdated versions or information that is no longer valid.
Summary: I have Piwik up and running, even with the GeoIP stuff, and so far it works great. Now, I can't modify the pages of the websites, so I have to rely on the Apache logs, which I have already modified to include the requested (sub)domain. Logs are in /var/log/http/$domain/$fqdn-access.log, e.g. /var/log/http/tree.com/stump.tree.com-access.log. New hosts and domains are always coming and going, so there is an unknown number of subdomains to handle.
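To make the layout concrete, here is a minimal sketch of how a log path splits into its two parts. The function name split_log_path is made up for illustration and is not part of Piwik; it only assumes the $domain/$fqdn-access.log naming described above.

```shell
#!/usr/bin/env bash
# Sketch only: split_log_path is a hypothetical helper, not a Piwik tool.
# It assumes the layout <root>/<domain>/<fqdn>-access.log.

split_log_path() {
    local log="$1"
    local domain fqdn

    # The parent directory name is the domain.
    domain=$(basename "$(dirname "$log")")
    # The file name minus the "-access.log" suffix is the host name.
    fqdn=$(basename "$log" -access.log)

    printf '%s %s\n' "$domain" "$fqdn"
}

split_log_path "/var/log/http/tree.com/stump.tree.com-access.log"
# prints: tree.com stump.tree.com
```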

I built my log-import script like this:
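#!/usr/local/bin/bash

# Configuration
BIN="/usr/local/www/piwik/misc/log-analytics/import_logs.py"
URL="http://server/piwik/"
SMP="4"   # number of parallel recorders
# EXTRA is left unquoted below on purpose so that it splits into separate options.
EXTRA="--enable-http-errors --enable-http-redirects --enable-static --enable-reverse-dns --enable-bots --add-sites-new-hosts"

# Hand every access log to the importer; -print0 / -0 keep unusual file names intact.
find /var/log/httpd/ -type f -iname "*access*" -print0 | xargs -0 "$BIN" --url="$URL" --recorders="$SMP" $EXTRA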

This actually adds new sites for... new sites (duh), as required. All nice and easy - yay!

Now the tricky part, also known as Problem (dun-dun-dun)...
The log files are deleted once a month (actually backed up, then deleted). This also means that until then, the logs are not rotated. The aforementioned script runs every 3 hours and re-imports each complete file on every run, so as far as I can see nearly every entry ends up duplicated. Hence the problem. (Is this a bug or a missing feature?)
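One workaround I can imagine is to remember how many lines of each file have already been imported and to hand over only the rest. This is just a rough sketch of that idea, not something Piwik provides: the helper name new_lines and the STATE_DIR location are made up.

```shell
#!/usr/bin/env bash
# Sketch only: new_lines and STATE_DIR are hypothetical, not part of Piwik.
# Prints the lines of a log file that were appended since the previous call.

STATE_DIR="${STATE_DIR:-/var/db/piwik-import}"

new_lines() {
    local log="$1"
    local state seen=0 total

    # One small counter file per log, named after the log's path.
    state="$STATE_DIR/$(printf '%s' "$log" | tr '/' '_').count"
    mkdir -p "$STATE_DIR"

    if [ -f "$state" ]; then
        seen=$(cat "$state")
    fi
    total=$(wc -l < "$log" | tr -d ' ')

    # After the monthly cleanup the file is shorter than the stored count,
    # so start again from the first line.
    if [ "$total" -lt "$seen" ]; then
        seen=0
    fi

    # Print lines seen+1 .. total, then remember how far we got.
    if [ "$total" -gt "$seen" ]; then
        sed -n "$((seen + 1)),${total}p" "$log"
    fi
    printf '%s\n' "$total" > "$state"
}
```

The output would then have to be fed to import_logs.py in place of the whole file, for example through a temporary file; whether the importer can also read from standard input is something I have not verified. The sketch also cannot tell a freshly started log from the old one if the new file has already grown past the stored line count.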

I also noticed there is an archive.php script, whose purpose is currently a mystery to me. Does it delete the duplicated entries? After a run of archive.php, a site that showed 4 visits (really 2 visits, each counted twice because of the duplicated log entries) still shows 4. The count does drop if I delete all the piwik_archive_* tables, but uh... This would mean I'd have to:

- run the update script,
- run the archive script,
- drop all archive tables.

every 3 hours!

I know I am missing something blatantly obvious here. The question is: what is the right way to keep Piwik updated from Apache logs that can't be rotated?

Thank you very much in advance,
great work with piwik,
-Christian.
