telegramR: Scraping Telegram Channels with R

R
telegramR
data collection
social media
NLP
text analysis

A practical guide to telegramR — an R client for Telegram’s MTProto API that lets you download messages, reactions, members, and media from any public channel directly into tibbles.

Published: March 31, 2026

Why Telegram?

Telegram has become one of the most significant platforms for political communication, grassroots organising, and — unfortunately — disinformation. With more than 900 million monthly active users and a liberal API policy, it is a goldmine for social scientists, computational linguists, and data journalists alike.

Until recently, R researchers had to call Python’s Telethon through reticulate, juggle environment issues, and wrestle with type conversions. telegramR fixes that: a native R package that speaks Telegram’s MTProto binary protocol directly, with no Python dependency.

# install.packages("remotes")  # if not already installed
remotes::install_github("RomanKyrychenko/telegramR")

Getting API Credentials

Before writing any R code you need a pair of credentials from Telegram:

  1. Log in at https://my.telegram.org.
  2. Go to API development tools.
  3. Create an application — the name and platform don’t matter for research use.
  4. Copy api_id (a number) and api_hash (a hex string).

Keep these values secret; treat api_hash like a password.
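One way to keep them out of your scripts is to store them as environment variables (for example in `~/.Renviron`) and read them back with `Sys.getenv()`. A minimal sketch, with placeholder values standing in for real credentials:

```r
# One-off: set the credentials for the current session. For persistence,
# put the same two lines (without quotes around the names) in ~/.Renviron.
# The values below are placeholders, not real credentials.
Sys.setenv(
  TG_API_ID   = "1234567",
  TG_API_HASH = "0123456789abcdef0123456789abcdef"
)

# Read them back wherever the client is created
api_id   <- Sys.getenv("TG_API_ID")
api_hash <- Sys.getenv("TG_API_HASH")
```

This way the credentials never appear in committed code, only in your local environment.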

Connecting and Logging In

library(telegramR)

client <- TelegramClient$new(
  session  = "research_session",   # saved to disk; reuse across runs
  api_id   = Sys.getenv("TG_API_ID"),
  api_hash = Sys.getenv("TG_API_HASH")
)

client$start()   # interactive: phone → SMS code → 2FA (if set)

The session file persists your authorisation key. Delete it to force a fresh login.

For scripted pipelines (CI, scheduled jobs) you can supply the phone number, a code callback, and the 2FA password explicitly rather than being prompted for each:

client$start(
  phone         = "+15551234567",
  code_callback = function() readline("Code: "),
  password      = Sys.getenv("TG_2FA")
)

Downloading Channel Messages

The workhorse function is download_channel_messages(). Pass a username (without @) or a numeric channel ID:

msgs <- download_channel_messages(
  client,
  channel = "bbcnews",
  limit   = 500
)

dplyr::glimpse(msgs)

The returned tibble has a rich schema:

Column               Description
message_id           Unique message identifier
date                 Timestamp (UTC)
text                 Full message text
views                View count at download time
forwards             Number of times forwarded
replies              Reply count
reactions_total      Total emoji reactions
reactions_json       Per-emoji breakdown (JSON)
media_type           photo / video / document / …
is_forward           Whether the post was forwarded
forward_from_name    Original channel name (for forwarded posts)
channel_title        Display name of the scraped channel
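Because the result is a plain tibble, it drops straight into tidyverse pipelines. A sketch of a simple engagement analysis, using a toy tibble standing in for a real scrape:

```r
library(dplyr)

# Toy stand-in for a download_channel_messages() result
msgs <- tibble(
  message_id      = 1:3,
  views           = c(1000, 5000, 250),
  forwards        = c(10, 250, 1),
  reactions_total = c(50, 900, 5)
)

# Crude engagement rate: reactions + forwards per view
engagement <- msgs |>
  mutate(rate = (reactions_total + forwards) / views) |>
  arrange(desc(rate))
```

Here message 2 tops the ranking: 1,150 interactions over 5,000 views.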

Filtering by Date

Avoid downloading years of history when you only care about a specific window:

msgs_jan <- download_channel_messages(
  client,
  channel    = "bbcnews",
  start_date = "2025-01-01",
  end_date   = "2025-01-31",
  limit      = Inf           # fetch everything in the window
)

Estimating Volume Before Downloading

Before pulling a large channel, check how much data you’re dealing with:

estimate_channel_post_count(client, "bbcnews")
#> [1] 48301

This returns an upper-bound estimate without downloading any messages.
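The estimate is useful for planning how to batch a large pull. A hypothetical helper (not part of telegramR) that splits a total like the 48,301 above into roughly equal request batches:

```r
# Split a total message count into batches of at most batch_size,
# e.g. to interleave downloads with pauses or checkpoints
plan_batches <- function(total, batch_size = 5000) {
  n <- ceiling(total / batch_size)
  data.frame(
    batch = seq_len(n),
    limit = c(rep(batch_size, n - 1), total - batch_size * (n - 1))
  )
}

plan <- plan_batches(48301)
```

Each row's `limit` can then be fed to a separate `download_channel_messages()` call.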

Channel Metadata

info <- download_channel_info(client, "bbcnews")

# Returns: id, title, username, description, member_count, creation_date

Reactions, Replies and Members

# Per-message reactions with emoji breakdown
reactions <- download_channel_reactions(client, "bbcnews", limit = 1000)

# Replies to the most recent posts
replies <- download_channel_replies(
  client,
  channel       = "bbcnews",
  message_limit = 100     # look at last 100 posts
)

# Public subscriber list (where available)
members <- download_channel_members(client, "bbcnews", limit = 5000)
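Since each of these tibbles carries a `message_id`, they can be joined back together for per-message engagement. A base-R sketch with toy data standing in for real downloads (column names other than `message_id` are illustrative):

```r
# Toy stand-ins for download_channel_reactions() / download_channel_replies()
reactions <- data.frame(message_id = c(1, 2, 3), reactions_total = c(12, 0, 7))
replies   <- data.frame(message_id = c(1, 3),    n_replies       = c(4, 1))

# Left join: keep every message, fill missing reply counts with 0
engaged <- merge(reactions, replies, by = "message_id", all.x = TRUE)
engaged$n_replies[is.na(engaged$n_replies)] <- 0
```

The same pattern works with `dplyr::left_join()` if you prefer tidyverse verbs.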

Downloading Media

dir.create("media", showWarnings = FALSE)

media_index <- download_channel_media(
  client,
  channel     = "bbcnews",
  limit       = 200,
  media_types = c("photo", "video"),
  start_date  = "2025-01-01",
  end_date    = "2025-02-01",
  out_dir     = "media"
)

head(media_index)

The function returns a tibble with the local file path alongside message metadata, so you can join it back to msgs by message_id.
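That join is a one-liner. A sketch with toy data (the `local_path` column name is illustrative; check the actual column names in your `media_index`):

```r
# Toy stand-ins for a message tibble and a media index
msgs <- data.frame(message_id = 1:3, text = c("a", "b", "c"))
media_index <- data.frame(
  message_id = c(2, 3),
  local_path = c("media/photo_2.jpg", "media/video_3.mp4")
)

# Inner join: keep only messages that have downloaded media
with_media <- merge(msgs, media_index, by = "message_id")
```

Use `all.x = TRUE` instead if you want to keep media-free messages with an `NA` path.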

Practical Tips

  • Rate limits — Telegram throttles heavy scrapers. Add Sys.sleep(1) between calls when downloading large histories.
  • Session reuse — the session file caches the authorisation key. Store it safely; don’t commit it to git.
  • Async returns — most low-level helpers return future objects. Unwrap them with future::value() if you call them directly.
  • Debug logging — suppress verbose output with:
options(
  telegramR.debug_pump    = FALSE,
  telegramR.debug_process = FALSE,
  telegramR.debug_parse   = FALSE
)
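For long-running scrapes, the rate-limit and pausing advice above can be wrapped into a small retry helper. A sketch in plain R (not part of telegramR); `fetch` is any zero-argument function that may fail transiently:

```r
# Retry a flaky call up to `tries` times, pausing longer after each failure
with_retry <- function(fetch, tries = 3, pause = 1) {
  for (i in seq_len(tries)) {
    out <- tryCatch(fetch(), error = function(e) e)
    if (!inherits(out, "error")) return(out)
    Sys.sleep(pause * i)  # linear back-off between attempts
  }
  stop(out)  # re-raise the last error after exhausting retries
}
```

Usage would look like `with_retry(function() download_channel_messages(client, "bbcnews", limit = 500))`.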

Summary

telegramR brings full MTProto client functionality to R without any Python bridge. Whether you’re building a disinformation monitor, studying political communication, or just curious about a niche community, the package gives you clean tibbles ready for tidyverse pipelines — from raw channel scrape to publication-ready analysis entirely in R.

Source: https://github.com/RomanKyrychenko/telegramR