Requested Files and Not Found panels duplicate information #2784

Closed
g9h0 opened this issue Jan 9, 2025 · 7 comments

@g9h0

g9h0 commented Jan 9, 2025

The "Requested Files (URLs)" panel contains requests for files that do not exist. The "Not Found Files (404)" panel contains the same requests. I'm not sure whether this is an intentional design decision, but it would be nice if the "Requested Files" panel were populated only with requests that return a 2xx status. That would give a cleaner view of which pages are most popular, without the clutter of requests that returned a 4xx.
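Until something like this exists in the panel itself, one possible workaround is to pre-filter the log before analysing it. The following is only a minimal sketch, assuming a log with one JSON object per line and a top-level integer `status` field; the function name `keep_2xx` and the sample lines are made up for illustration and are not part of goaccess.

```python
import json


def keep_2xx(lines):
    """Yield only the log lines whose top-level "status" is 200-299."""
    for line in lines:
        line = line.strip()
        if not line:
            continue
        try:
            entry = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip lines that are not valid JSON
        if not isinstance(entry, dict):
            continue  # skip JSON values that are not objects
        status = entry.get("status")
        if isinstance(status, int) and 200 <= status <= 299:
            yield line


# Hypothetical sample lines, reduced to the fields the filter looks at.
sample = [
    '{"request": {"uri": "/index.html"}, "status": 200}',
    '{"request": {"uri": "/wp-admin/js/about.php"}, "status": 404}',
    '{"request": {"uri": "/wp-admin/js/about.php"}, "status": 308}',
]
filtered = list(keep_2xx(sample))
```

The filtered lines could then be written to a separate file and analysed on their own, at the cost of losing the non-2xx data from every other panel as well.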

@allinurl
Owner

Could you please provide a few sample lines from your log so I can replicate the issue? Additionally, feel free to share a screenshot of the problem for better clarity. Thanks!

@g9h0
Author

g9h0 commented Jan 10, 2025

@allinurl Thanks for the reply. I have attached both logs and screen captures. These are structured logs from Caddy. The two entries differ slightly: one returns a 308 and the other a 404. For reference, I don't run a WordPress site, so these requests are invalid. Thanks :)

{ "level": "info", "ts": 1736332408.880364, "logger": "http.log.access.log0", "msg": "handled request", "request": { "remote_ip": "x.x.x.x", "remote_port": 60259, "client_ip": "x.x.x.x", "proto": "HTTP/1.1", "method": "GET", "host": "g9h.io", "uri": "/wp-admin/js/about.php", "headers": {} }, "bytes_read": 0, "user_id": "", "duration": 0.000055891, "size": 0, "status": 308, "resp_headers": { "Location": [ "https://g9h.io/wp-admin/js/about.php" ], "Content-Type": [], "Server": [ "Caddy" ], "Connection": [ "close" ] } }

{ "level": "info", "ts": 1736332408.9551666, "logger": "http.log.access.log0", "msg": "handled request", "request": { "remote_ip": "x.x.x.x", "remote_port": "53205", "client_ip": "x.x.x.x", "proto": "HTTP/1.1", "method": "GET", "host": "g9h.io", "uri": "/wp-admin/js/about.php", "headers": {}, "tls": { "resumed": false, "version": 771, "cipher_suite": 49195, "proto": "", "server_name": "g9h.io" } }, "bytes_read": 0, "user_id": "", "duration": 0.00013649, "size": 0, "status": 404, "resp_headers": { "Server": [ "Caddy" ], "Alt-Svc": [ "h3=\":443\"; ma=2592000" ] } }

Screenshot 2025-01-10 at 13 58 56

Screenshot 2025-01-10 at 13 59 07
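To compare the two entries above side by side, it can help to pull out only the fields that differ. This is a rough sketch, not anything goaccess does internally: `summarize` is a hypothetical helper, the two entries are abbreviated copies of the log lines above, and reading the presence of a `tls` object as "this request arrived over TLS" is an assumption about Caddy's log layout.

```python
import json


def summarize(entry):
    """Reduce a Caddy-style access log entry to the fields of interest."""
    request = entry.get("request", {})
    location = entry.get("resp_headers", {}).get("Location", [])
    return {
        "uri": request.get("uri"),
        "status": entry.get("status"),
        # Assumption: a "tls" object is only logged for TLS connections.
        "has_tls": "tls" in request,
        "location": location[0] if location else None,
    }


# Abbreviated copies of the two entries posted above.
first = json.loads(
    '{"request": {"host": "g9h.io", "uri": "/wp-admin/js/about.php"},'
    ' "status": 308,'
    ' "resp_headers": {"Location": ["https://g9h.io/wp-admin/js/about.php"]}}'
)
second = json.loads(
    '{"request": {"host": "g9h.io", "uri": "/wp-admin/js/about.php",'
    ' "tls": {"server_name": "g9h.io"}},'
    ' "status": 404, "resp_headers": {}}'
)

for entry in (first, second):
    print(summarize(entry))
```

Under that assumption, the 308 entry has no `tls` object and carries a `Location` header pointing at the same path over `https://`, while the 404 entry has a `tls` object and no `Location` header.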

@allinurl
Owner

From the two requests you shared, one returns a 308 status, while the other returns a 404. On my end, both behave as expected. If those files or directories don't exist, it seems that Caddy shouldn't be returning a 308 in this case, right?

2025-01-13-144441_650x384_scrot

@webstudiobond

Confirmed. Similar problem with nginx.

The REQUESTS panel receives requests with status 404, 403, etc.

Perhaps we need an option to explicitly specify which statuses to consider for the REQUESTS panel.

Perhaps we need to output an additional column indicating status. (METHOD, PROTOCOL, STATUS, DATA)

Ideally, there would be a separate table per status class, as is already done for 404s:
REQUESTS_2xx
REQUESTS_3xx
NOT_FOUND
REQUESTS_4xx (excluding NOT_FOUND)
REQUESTS_5xx
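The grouping proposed above could be sketched roughly as follows. This is not an existing goaccess feature: the panel names are taken from the list above, while `panel_for`, `build_panels`, the `OTHER` bucket, and the sample data are hypothetical and only illustrate the idea.

```python
from collections import Counter, defaultdict


def panel_for(status):
    """Map an HTTP status code to one of the proposed panel names."""
    if status == 404:
        return "NOT_FOUND"
    if 200 <= status <= 299:
        return "REQUESTS_2xx"
    if 300 <= status <= 399:
        return "REQUESTS_3xx"
    if 400 <= status <= 499:
        return "REQUESTS_4xx"  # 404 was already handled above
    if 500 <= status <= 599:
        return "REQUESTS_5xx"
    return "OTHER"  # hypothetical bucket for anything else, e.g. 1xx


def build_panels(requests):
    """Count hits per URI within each panel from (uri, status) pairs."""
    panels = defaultdict(Counter)
    for uri, status in requests:
        panels[panel_for(status)][uri] += 1
    return panels


# Hypothetical (uri, status) pairs.
sample = [
    ("/", 200),
    ("/", 200),
    ("/old", 308),
    ("/wp-admin/js/about.php", 404),
    ("/admin", 403),
    ("/api", 500),
]
panels = build_panels(sample)
```

Checking for 404 before the generic 4xx range is what keeps `NOT_FOUND` and `REQUESTS_4xx` disjoint.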

@allinurl
Owner

It appears your application might be incorrectly returning a 308 when a page is not found, leading to a subsequent 404. goaccess simply reflects the log data. I'm not convinced that treating 308 as 404 with a flag is beneficial beyond this specific case. Since they represent different server responses, it might obscure valuable info.

#117 will address filtering and searching for these specific cases. Stay tuned.

If you need to handle 444 status codes as 404s in goaccess, use the --444-as-404 flag.

@g9h0
Author

g9h0 commented Jan 16, 2025

After further investigation, I've realised that this problem is in fact caused by Caddy automatically upgrading requests from HTTP to HTTPS. I agree that showing 308s in the requests panel is the correct thing to do, as a 308 can be the response to a legitimate request.

To resolve the problem, I have added the following configuration to my Caddyfile, and closed port 80 on my firewall.

{
	auto_https disable_redirects
}

Caddy documentation reference: https://caddyserver.com/docs/caddyfile/options#auto-https
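For logs collected before the redirects were disabled, the upgrade redirects can probably be recognised after the fact. The sketch below is an illustration only: `is_https_upgrade` is a hypothetical helper, the set of redirect codes it checks is a guess, and it assumes the entry layout shown in the logs earlier in this thread, where the redirect has no `tls` object and its `Location` header is `https://` followed by the same host and URI.

```python
REDIRECT_CODES = (301, 302, 307, 308)  # assumed set; the log above shows 308


def is_https_upgrade(entry):
    """Return True if the entry looks like a plain-HTTP to HTTPS redirect."""
    request = entry.get("request", {})
    if "tls" in request:
        return False  # already served over TLS, so not an upgrade
    if entry.get("status") not in REDIRECT_CODES:
        return False
    locations = entry.get("resp_headers", {}).get("Location", [])
    expected = "https://{}{}".format(
        request.get("host", ""), request.get("uri", "")
    )
    return expected in locations


# Abbreviated copies of the two entries from earlier in the thread.
redirect_entry = {
    "request": {"host": "g9h.io", "uri": "/wp-admin/js/about.php"},
    "status": 308,
    "resp_headers": {"Location": ["https://g9h.io/wp-admin/js/about.php"]},
}
not_found_entry = {
    "request": {
        "host": "g9h.io",
        "uri": "/wp-admin/js/about.php",
        "tls": {"server_name": "g9h.io"},
    },
    "status": 404,
    "resp_headers": {},
}

upgrade_flags = [is_https_upgrade(e) for e in (redirect_entry, not_found_entry)]
```

Entries flagged this way could be dropped before analysis, which leaves redirects that point somewhere else untouched.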

Many thanks for your help with this issue :)

g9h0 closed this as completed Jan 16, 2025
@allinurl
Owner

Happy to help! I’m glad you were able to sort it out.
