A representative random sample of tweets


The image above shows the rate at which new data is sampled from Twitter into the database, in 30-second bursts. Each burst typically loads around 1,000 to 3,000 tweets. One sampling burst takes place each hour, at a random time within that hour.
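As a rough illustration of the schedule described above, the sketch below picks one 30-second window at a random offset within a given hour. The function name `burst_window` and the choice to keep the whole burst inside the hour are assumptions made for this example, not details of the actual collector.

```python
import random

BURST_SECONDS = 30        # burst length stated in the text
SECONDS_PER_HOUR = 3600


def burst_window(hour_start: int, rng: random.Random) -> tuple[int, int]:
    """Return (start, end) in seconds for one burst within the hour.

    hour_start is the hour's starting time in seconds (e.g. a Unix
    timestamp). Assumption: the burst must finish before the hour ends.
    """
    # Latest offset that still leaves room for a full burst.
    offset = rng.randint(0, SECONDS_PER_HOUR - BURST_SECONDS)
    start = hour_start + offset
    return start, start + BURST_SECONDS


rng = random.Random()
start, end = burst_window(0, rng)
print(f"burst runs from second {start} to second {end} of the hour")
```

Passing a seeded `random.Random` makes the schedule reproducible, which can help when checking that bursts are spread evenly over the hour.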

The actual sampling rate is unknown because it depends on the proportion of tweets provided by Twitter's streaming API, which varies over time. Studies have estimated this proportion to be between 1% and 40% of all tweets. The database behind this app holds 1/120th of that stream (30 seconds out of each hour), so the sample here probably ranges between 1 in 12,000 and 1 in 300 of all tweets.
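The arithmetic behind that range can be checked with a short sketch. It only restates the figures given in the text (a 30-second burst per hour, and a streaming API share of 1% to 40%); the function name `overall_sampling_fraction` is made up for this example.

```python
from fractions import Fraction

BURST_SECONDS = 30
SECONDS_PER_HOUR = 3600


def overall_sampling_fraction(stream_share: Fraction) -> Fraction:
    """Fraction of all tweets that end up in the database.

    stream_share is the share of all tweets the streaming API delivers.
    """
    burst_fraction = Fraction(BURST_SECONDS, SECONDS_PER_HOUR)  # 1/120
    return stream_share * burst_fraction


low = overall_sampling_fraction(Fraction(1, 100))    # API delivers 1%
high = overall_sampling_fraction(Fraction(40, 100))  # API delivers 40%

print(f"lowest rate:  1 in {1 / low}")   # 1 in 12000
print(f"highest rate: 1 in {1 / high}")  # 1 in 300
```

Using `Fraction` keeps the results exact, so the reciprocals come out as whole numbers rather than floating-point approximations.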

The number of tweets in the sample shows both daily and weekly periodicity; the peak is at around 15:30 UTC each day.