Tuesday, April 14, 2026

Location, Location, Location

One of the fascinating things about web stats is seeing where the various visitors to your site come from.  Or supposedly come from.  According to my IP address, for instance, I can be either in Leeds, Rochdale or Wakefield, whereas, in reality, I'm physically located at the other end of the country.  It's down to the fact that these days I'm with a relatively small ISP, whose servers are all based in the Leeds area, so that's where I appear to be when I surf the web.  Which can cause problems when visiting sites offering localised services: weather reports, for instance - if I allow the site to automatically detect my location then I end up getting a weather report for northern climes, which is useless to me.  Which is why I have to specify my locale myself.  It could be worse - when I was with a larger ISP, I sometimes found myself apparently in California or even South Korea, so widespread was its server network.  I'm guessing that at times of high demand, they just routed customers via whichever servers were available.  That said, most of the larger UK ISPs have networks extensive enough that you'll be routed through a server physically closer to your real location.  Which brings me to the point I was originally intending to make: that every time I see visitors to The Sleaze or to this blog coming via servers in my local area, I immediately assume that they must be someone I know.  Which is obviously ridiculous, as I only know personally an infinitesimal number of the people who live locally to me and the majority of them don't know that I run these sites.  Plus, as I've already indicated, there is no guarantee that they really are, physically, in my area - they could be at the other end of the country and simply being routed via a local server.

While often this locational confusion is simply a result of ISP routing, increasingly it is deliberate.  Not just as the result of an increased use of VPNs.  Traffic stats increasingly seem dominated by bots, which routinely mask their true origin and identity by routing through servers and networks geographically remote from their point of origin.  Traditionally, these have been 'content scrapers', looking for data for usually dubious marketing schemes.  Increasingly, though, they are scraping content for the benefit of various AIs.  None of them seem to want to openly identify themselves - not even the 'household name' AIs.  Google's Gemini, for instance, scrapes (or indexes, as they would have it), in the guise of the regular Google bots used to index sites for their search engine.  Similarly, ChatGPT scrapes under the guise of Microsoft's Bing bots.  Why so reticent about revealing their true identity?  Well, probably because they fear being blocked by webmasters if they scrape openly as AIs.  This way, because webmasters won't be able to tell the difference between Microsoft and Google bots legitimately indexing for their search engines and AI scrapers, they won't get blocked.  Of late, one of my stats providers took the unilateral decision to block from customers' stats everything they deemed to be a bot.  Unfortunately, their criteria for bot classification seem very shaky, based on location more than anything.  Swept up in this are all manner of legitimate visits, using VPNs or Google's AMP format, somewhat invalidating the stats we do see.  Moreover, it is actually important to see bot visits - what they are scraping/indexing is of as much interest as where they come from.  Some of the now blocked bot visits are of more direct use - Facebook bot visits, for example, give an idea of the traffic your site is getting from them and which pages are generating it.  (Facebook caches versions of site pages indexed there, so that when visited from Facebook, they don't generate a direct hit with your stats, but rather a bot visit - an oversimplification, but you get the gist of it, I'm sure.)
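
For the technically minded, here is a rough sketch of the alternative: labelling visits by what they announce themselves as, rather than throwing them away on the basis of location.  It is only an illustration, not how any actual stats provider works.  The function names, the token list and the idea of feeding it (user agent, page) pairs are all assumptions made for the example, and since a user agent string can be faked, the label is a hint at best.

```python
# Hypothetical sketch: label visits by user-agent token instead of dropping them.
# The token list below is an illustrative assumption, not a complete catalogue,
# and user agents can be spoofed, so treat each label as a hint only.
KNOWN_BOT_TOKENS = {
    "googlebot": "Google bot",
    "bingbot": "Bing bot",
    "facebookexternalhit": "Facebook bot",
}


def label_visit(user_agent):
    """Return a label for a visit; unrecognised agents count as ordinary visitors."""
    ua = user_agent.lower()
    for token, label in KNOWN_BOT_TOKENS.items():
        if token in ua:
            return label
    return "visitor"


def summarise(visits):
    """Count page requests per label, keeping bot visits visible in the stats."""
    summary = {}
    for user_agent, page in visits:
        pages = summary.setdefault(label_visit(user_agent), {})
        pages[page] = pages.get(page, 0) + 1
    return summary
```

Fed a list of visits, summarise() gives a per-label count of which pages were requested, so the Facebook bot hits stay in view alongside the human ones instead of vanishing.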

Not that the stats provider in question seems to care about any of this, as they themselves don't seem to understand any of it and simply want to pander to their less informed customers who just see these bots as a nuisance messing up their visitor statistics.  Which, of course, simply makes it ever more difficult to keep proper track of who and what are visiting your site and why - all vital questions to the serious webmaster.  But hey, getting back to my earlier point - if you are someone visiting here who does know me, get in touch properly, why not?  My mobile number hasn't changed - drop me a text, or something.  I'm not entirely anti-social, you know!
