
Pointing out more instances of a problem doesn't make the problem "hardly inappropriate"; it indicates the problem is more widespread.


I hear they use Ford vehicles and Boeing airplanes as well. Highly inappropriate for the US Government to use products from some of the US's biggest companies!

/s


Analytics makes websites better, and Piwik is not easy or cheap to use at scale. Sharing web traffic data with Google Analytics (especially with IPs anonymized) is a pretty small issue IMO, especially when the benefit is good data like what's on analytics.usa.gov.

(Disclosure: I work on analytics.usa.gov.)


> (especially with IPs anonymized)

While I'm sure you have good intentions in trying to anonymize the collected data, you should know that the claims made by Google at [1] are highly misleading. Their claim is that they strip the last octet of the IP address, but that isn't nearly enough cooking to call the data "anonymous".

Assuming Google actually does this and isn't simply lying about saving the full IP (which has to be sent to them, as it is obviously in the SRC field of the IP header), this means they are only binning IPs into groups of 256. They only need at most 8 bits of identifying data other than the IP to uniquely correlate individuals within each IP group.
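
A rough sketch of that arithmetic, assuming the anonymization is nothing more than zeroing the last octet of an IPv4 address as described in [1]. The function name and the sample addresses (taken from the 203.0.113.0/24 documentation range) are made up for illustration; this is not Google's code.

```python
import ipaddress
import math


def strip_last_octet(ip: str) -> str:
    """Zero the last octet of an IPv4 address (assumed masking scheme)."""
    # strict=False lets us pass a host address and get its enclosing /24.
    network = ipaddress.ip_network(f"{ip}/24", strict=False)
    return str(network.network_address)


# Hypothetical visitors; both collapse into the same masked value.
visitors = ["203.0.113.7", "203.0.113.200"]
masked = {strip_last_octet(ip) for ip in visitors}

# Each masked value stands for a bin of this many possible addresses.
group_size = ipaddress.ip_network("203.0.113.0/24").num_addresses

# Bits of extra information needed to single out one address in the bin.
bits_needed = math.log2(group_size)

print(masked, group_size, bits_needed)
```

The masked value still names the /24, so only the position inside a 256-address bin is hidden, and 8 bits from any other source are enough to recover it.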

More than 8 bits of unique ID are available, given the various types of data being tracked[2] about the user agent and (presumably) IP geolocation.

Also, Google is keeping enough of the IP to look up the ASN.

> I work on analytics.usa.gov.

So if you're actually interested in claiming that "The program does not track individuals, and anonymizes the IP addresses of visitors"[3], then you really shouldn't be using an analytics service that - in spite of their claims of "anonymizing" some data - obviously is tracking individuals and saving the most interesting parts of their IP address.

> Sharing web traffic data with Google Analytics [...] is a pretty small issue IMO,

Building profiles of everything we read or do online is probably the largest problem we will have going forward. Data doesn't go away, and the problem compounds when the data can be correlated with other types of data. If you think this isn't a problem, then you really need to study what is possible with pattern-of-life analysis[4].

[1] https://support.google.com/analytics/answer/2763052?hl=en

[2] https://analytics.usa.gov/data/

[3] https://analytics.usa.gov/#explanation

[4] https://en.wikipedia.org/wiki/Pattern-of-life_analysis


From https://analytics.usa.gov/#explanation it sounds like it just happens to use Google Analytics at the moment, but could switch in the future.



