telegramR: Scraping Telegram Channels with R
A practical guide to telegramR — an R client for Telegram’s MTProto API that lets you download messages, reactions, members, and media from any public channel directly into tibbles.
Why Telegram?
Telegram has become one of the most significant platforms for political communication, grassroots organising, and — unfortunately — disinformation. With more than 900 million monthly active users and a liberal API policy, it is a goldmine for social scientists, computational linguists, and data journalists alike.
Until recently, R researchers had to call Python’s Telethon through reticulate, juggle environment issues, and wrestle with type conversions. telegramR fixes that: a native R package that speaks Telegram’s MTProto binary protocol directly, with no Python dependency.
```r
remotes::install_github("RomanKyrychenko/telegramR")
```

Getting API Credentials
Before writing any R code you need a pair of credentials from Telegram:
- Log in at https://my.telegram.org.
- Go to API development tools.
- Create an application — the name and platform don’t matter for research use.
- Copy `api_id` (a number) and `api_hash` (a hex string).
Keep these values secret; treat `api_hash` like a password.
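One common pattern (a general R convention, not something telegramR requires) is to keep the credentials in an `.Renviron` file that never enters version control, and read them with `Sys.getenv()`:

```r
# .Renviron (add this file to .gitignore; values below are placeholders):
#   TG_API_ID=1234567
#   TG_API_HASH=0123456789abcdef0123456789abcdef

# In your R session the values are then available as environment variables
api_id   <- Sys.getenv("TG_API_ID")
api_hash <- Sys.getenv("TG_API_HASH")

# Warn early if they are missing instead of failing mid-scrape
if (!nzchar(api_id) || !nzchar(api_hash)) {
  message("Set TG_API_ID and TG_API_HASH in .Renviron before connecting")
}
```

R reads `.Renviron` automatically at startup, which is why the connection code below can call `Sys.getenv("TG_API_ID")` directly.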
Connecting and Logging In
```r
library(telegramR)

client <- TelegramClient$new(
  session  = "research_session",  # saved to disk; reuse across runs
  api_id   = Sys.getenv("TG_API_ID"),
  api_hash = Sys.getenv("TG_API_HASH")
)

client$start()  # interactive: phone → SMS code → 2FA (if set)
```

The session file persists your authorisation key. Delete it to force a fresh login.
For non-interactive pipelines (CI, scheduled jobs) you can pass everything explicitly:
```r
client$start(
  phone         = "+15551234567",
  code_callback = function() readline("Code: "),
  password      = Sys.getenv("TG_2FA")
)
```

Downloading Channel Messages
The workhorse function is `download_channel_messages()`. Pass a username (without the `@`) or a numeric channel ID:
```r
msgs <- download_channel_messages(
  client,
  channel = "bbcnews",
  limit   = 500
)

dplyr::glimpse(msgs)
```

The returned tibble has a rich schema:
| Column | Description |
|---|---|
| `message_id` | Unique message identifier |
| `date` | Timestamp (UTC) |
| `text` | Full message text |
| `views` | View count at download time |
| `forwards` | Number of times forwarded |
| `replies` | Reply count |
| `reactions_total` | Total emoji reactions |
| `reactions_json` | Per-emoji breakdown (JSON) |
| `media_type` | photo / video / document / … |
| `is_forward` | Whether the post was forwarded |
| `forward_from_name` | Original channel name |
| `channel_title` | Display name of scraped channel |
Filtering by Date
Avoid downloading years of history when you only care about a specific window:
```r
msgs_jan <- download_channel_messages(
  client,
  channel    = "bbcnews",
  start_date = "2025-01-01",
  end_date   = "2025-01-31",
  limit      = Inf  # fetch everything in the window
)
```

Estimating Volume Before Downloading
Before pulling a large channel, check how much data you’re dealing with:
```r
estimate_channel_post_count(client, "bbcnews")
#> [1] 48301
```

This returns an upper-bound estimate without downloading any messages.
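One way to use the estimate (a sketch, not a package idiom — the `cap` budget and variable names here are made up for illustration) is to bound `limit` before committing to a large download:

```r
# Cap very large channels instead of pulling the full history
n_est <- estimate_channel_post_count(client, "bbcnews")
cap   <- 10000  # arbitrary budget for this example

msgs <- download_channel_messages(
  client,
  channel = "bbcnews",
  limit   = min(n_est, cap)  # never request more than the budget
)
```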
Channel Metadata
```r
info <- download_channel_info(client, "bbcnews")
# Returns: id, title, username, description, member_count, creation_date
```

Reactions, Replies and Members
```r
# Per-message reactions with emoji breakdown
reactions <- download_channel_reactions(client, "bbcnews", limit = 1000)

# Replies to the most recent posts
replies <- download_channel_replies(
  client,
  channel       = "bbcnews",
  message_limit = 100  # look at the last 100 posts
)

# Public subscriber list (where available)
members <- download_channel_members(client, "bbcnews", limit = 5000)
```

Downloading Media
```r
dir.create("media", showWarnings = FALSE)

media_index <- download_channel_media(
  client,
  channel     = "bbcnews",
  limit       = 200,
  media_types = c("photo", "video"),
  start_date  = "2025-01-01",
  end_date    = "2025-02-01",
  out_dir     = "media"
)

head(media_index)
```

The function returns a tibble with the local file path alongside message metadata, so you can join it back to `msgs` by `message_id`.
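The join itself is plain dplyr. A minimal sketch with stand-in tibbles (the column name `local_path` is an assumption — check `names(media_index)` for the actual name in your version):

```r
library(dplyr)

# Stand-ins for the real msgs / media_index tibbles
msgs_demo <- tibble(
  message_id = c(101, 102, 103),
  text       = c("post one", "post two", "post three")
)
media_demo <- tibble(
  message_id = c(101, 103),
  local_path = c("media/101.jpg", "media/103.mp4")  # assumed column name
)

# left_join keeps every post; posts without media get NA in local_path
joined <- left_join(msgs_demo, media_demo, by = "message_id")
joined
```

A left join (rather than inner) is the safer default here: it preserves text-only posts, so downstream counts per day are not silently biased toward media posts.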
A Mini Analysis: Reaction Trends
Here’s a small end-to-end snippet — download a week of posts, parse reactions, and plot engagement over time:
```r
library(telegramR)
library(dplyr)
library(tidyr)
library(ggplot2)
library(jsonlite)

msgs <- download_channel_messages(
  client, "bbcnews",
  start_date = "2025-03-01", end_date = "2025-03-07",
  limit = Inf
)

# Parse the reactions JSON into one row per (message, emoji).
# Keep only the columns we pivot — pivoting the full tibble would
# mix character and numeric columns and error out.
reactions_long <- msgs |>
  filter(!is.na(reactions_json)) |>
  select(message_id, date, reactions_json) |>
  mutate(emoji_data = lapply(reactions_json, fromJSON)) |>
  select(-reactions_json) |>
  unnest_wider(emoji_data) |>
  pivot_longer(
    cols = -c(message_id, date),
    names_to = "emoji", values_to = "count"
  ) |>
  filter(!is.na(count))

# Plot the top-5 reactions per day
reactions_long |>
  mutate(day = as.Date(date)) |>
  group_by(day, emoji) |>
  summarise(total = sum(count), .groups = "drop") |>
  slice_max(total, n = 5, by = day) |>
  ggplot(aes(day, total, fill = emoji)) +
  geom_col(position = "dodge") +
  labs(
    title = "Daily Telegram Reactions — BBC News",
    x = NULL,
    y = "Reaction count",
    fill = "Emoji"
  ) +
  theme_minimal()
```

Practical Tips
- Rate limits — Telegram throttles heavy scrapers. Add `Sys.sleep(1)` between calls when downloading large histories.
- Session reuse — the session file caches the authorisation key. Store it safely; don't commit it to git.
- Async returns — most low-level helpers return `future` objects. Unwrap them with `future::value()` if you call them directly.
- Debug logging — suppress verbose output with:
```r
options(
  telegramR.debug_pump    = FALSE,
  telegramR.debug_process = FALSE,
  telegramR.debug_parse   = FALSE
)
```

Summary
telegramR brings full MTProto client functionality to R without any Python bridge. Whether you’re building a disinformation monitor, studying political communication, or just curious about a niche community, the package gives you clean tibbles ready for tidyverse pipelines — from raw channel scrape to publication-ready analysis entirely in R.