Comparing geotagged tweet volumes available with the Twitter Premium Search API, the Twitter Search API, and TWINT

Setup and search for tweets using the Twitter Premium Search API (counts endpoint)

How do the hourly and daily endpoints compare?

When grouped by day, the hourly counts closely match the daily counts. However, the aggregated hourly counts appear to slightly overestimate the daily counts, especially on days with low tweet volume. This is explained by the quantization of low-value counts, described below.
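The comparison between the two bucket sizes can be sketched as follows. This is a minimal illustration rather than the notebook's actual code: the function names are hypothetical, the sample values are made up, and the counts are assumed to have already been parsed into `(datetime, count)` pairs for hourly buckets and a `{date: count}` mapping for daily buckets.

```python
from collections import defaultdict
from datetime import datetime, timedelta


def aggregate_hourly_to_daily(hourly_counts):
    """Sum (datetime, count) hourly buckets into a {date: total} mapping."""
    daily = defaultdict(int)
    for timestamp, count in hourly_counts:
        daily[timestamp.date()] += count
    return dict(daily)


def daily_differences(hourly_counts, daily_counts):
    """Return {date: aggregated_hourly - reported_daily} for dates present in both."""
    aggregated = aggregate_hourly_to_daily(hourly_counts)
    return {
        day: aggregated[day] - daily_counts[day]
        for day in aggregated
        if day in daily_counts
    }


# Made-up sample data (not real API output): one non-zero hourly bucket
# every six hours over two days, plus the two corresponding daily buckets.
start = datetime(2020, 1, 1)
sample_hourly = [
    (start + timedelta(hours=h), 5 if h % 6 == 0 else 0) for h in range(48)
]
sample_daily = {
    start.date(): 14,
    (start + timedelta(days=1)).date(): 20,
}

diffs = daily_differences(sample_hourly, sample_daily)
for day, diff in sorted(diffs.items()):
    print(day, diff)
```

A positive difference means the summed hourly buckets exceed the daily bucket for that day.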

Quantization of data

The counts endpoint appears to quantize (or obfuscate) any count between 1 and 5 to a value of 5. Interestingly, this appears to happen only for georeferenced queries, as shown further below in this notebook.
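To see why this would inflate aggregated hourly counts, here is a small sketch of the rule as inferred above. The `quantize` function is a hypothetical model of the observed behaviour, not documented API behaviour, and the hourly values are made up.

```python
def quantize(count, floor=5):
    """Assumed rule: counts from 1 to `floor` are reported as `floor`;
    zero and larger counts are reported unchanged."""
    return floor if 0 < count <= floor else count


# Made-up "true" hourly counts for part of a low-volume day.
true_hourly = [0, 1, 2, 0, 3, 7, 0, 1]

# What the endpoint would report per hour under the assumed rule.
reported_hourly = [quantize(c) for c in true_hourly]

true_daily = sum(true_hourly)                 # 14
reported_daily = quantize(true_daily)         # 14: above the floor, so unchanged
aggregated_hourly = sum(reported_hourly)      # 27: each small bucket was rounded up

print(reported_hourly)
print(reported_daily, aggregated_hourly)
```

Under this assumption each low-count hourly bucket is rounded up separately, so the summed hourly counts exceed the daily count. The effect is largest when volume is low, because more buckets fall in the 1 to 5 range.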

Setup and search for tweets using TWINT

How does this compare to the Premium Search API counts?

For the most recent 7 days, TWINT appears to return many more tweets than the Premium Search API counts endpoint reports. Beyond this period, the two are very similar. The exceptions are a few cases where the counts endpoint reports more tweets than TWINT returns, which could be a consequence of deleted tweets.
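One way to make this comparison concrete is to count the TWINT tweets per day and subtract the corresponding API counts, separating the most recent 7 days from earlier dates. The sketch below assumes the TWINT results have already been reduced to a list of `date` objects, one per tweet. The function name and sample values are hypothetical, and no TWINT or Twitter API calls are shown.

```python
from collections import Counter
from datetime import date, timedelta


def split_differences(twint_dates, api_counts, today, window_days=7):
    """Return (recent, older) dicts of TWINT-minus-API daily differences.

    twint_dates: one date per tweet returned by TWINT.
    api_counts:  {date: count} as reported by the counts endpoint.
    """
    twint_counts = Counter(twint_dates)
    cutoff = today - timedelta(days=window_days)
    recent, older = {}, {}
    for day, api_count in api_counts.items():
        diff = twint_counts.get(day, 0) - api_count
        if day > cutoff:
            recent[day] = diff
        else:
            older[day] = diff
    return recent, older


# Made-up sample data (not real results).
today = date(2020, 3, 10)
twint_dates = [date(2020, 3, 9)] * 3 + [date(2020, 3, 1)] * 2
api_counts = {date(2020, 3, 9): 1, date(2020, 3, 1): 2}

recent, older = split_differences(twint_dates, api_counts, today)
print(recent)
print(older)
```

A positive difference means TWINT returned more tweets than the API counted for that day. A negative difference is the case that deleted tweets might explain.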