Telegram is a growing instant messaging platform that has gained great popularity in the past few years. With its openness and superior user-friendliness, it has attracted a lot of users, and a lot of spammers along with them.
What people are using
From what I have observed, most spammers spam in 3 languages (or rather scripts): Chinese (Traditional, Simplified, and Obfuscated), Arabic (and Persian), and as always, English. There are currently several anti-spam strategies commonly used among Telegram group admins/bots:
- Block by script. This is seemingly the most straightforward method: simply block anyone who comes in and sends anything in a certain language/script. It can be easily circumvented by sending just pictures with a link. It is also not really useful in groups that mainly discuss in the same language as the spammers.
- Block by patterns. The majority of spam messages seem to follow a specific pattern, be it full message matching or keywords. Some anti-spam bots maintain a list of such patterns and apply it in an AdBlock Filter List-like way. But maintaining such a list for universal use with an acceptable false-positive rate is hard. Pattern matching can be bypassed by replacing characters with similar-looking Unicode characters, inserting invisible characters, or simply leaving no distinguishable pattern in any text.
- Block by message type / links. Almost all spammers I have seen spam groups for one single purpose: advertising. That means spam messages must carry their ads in one way or another: Telegram channel links, shortened links to elsewhere, forwarded messages from channels, etc. So some admins turned to blocking messages that fit in these categories. Surely, this is effective when a group is being flooded with spam, but it is not really good in the long term due to its false-positive rate. Normal messages could also easily fall into one of these categories and get their senders warned or banned.
- CAPTCHAs. By far the most classic anti-spam strategy. A bot doing CAPTCHAs usually blocks a newcomer from sending messages until they answer the CAPTCHA. Common challenge types include code-in-picture, button-clicking, calculations, simple trivia and theme-based quizzes. Some groups even redirect users to a webpage to complete a Google reCAPTCHA. Similar to welcome bots, this can look bad when a lot of people join the group in a short period of time.
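To make the weakness of pattern blocking concrete, here is a minimal Python sketch. The blocklist entry and the function names are made up for illustration and are not taken from any real anti-spam bot. It shows a keyword filter being bypassed by a zero-width character, and how normalising the text first recovers some, but not all, of those cases.

```python
import re
import unicodedata

# Hypothetical blocklist; a real bot would maintain a much longer one.
BLOCKED_PATTERNS = [re.compile(r"free\s+bitcoin", re.IGNORECASE)]

# A few invisible code points that can be used to split up keywords.
INVISIBLE = {"\u200b", "\u200c", "\u200d", "\u2060", "\ufeff"}


def naive_is_spam(text: str) -> bool:
    """Match the raw message text against the blocklist."""
    return any(p.search(text) for p in BLOCKED_PATTERNS)


def normalise(text: str) -> str:
    """Fold compatibility forms (e.g. fullwidth letters) and drop invisible characters."""
    folded = unicodedata.normalize("NFKC", text)
    return "".join(ch for ch in folded if ch not in INVISIBLE)


def is_spam(text: str) -> bool:
    """Match the normalised message text against the blocklist."""
    return naive_is_spam(normalise(text))
```

With this sketch, `"fr\u200bee bitcoin"` slips past `naive_is_spam` but is caught by `is_spam`. A message that swaps the Latin "e" for the Cyrillic "е" still gets through both, since NFKC does not map look-alike letters across scripts. That is the kind of gap that makes a universal pattern list hard to maintain.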
What I am using
The strategy I am using now is rather a coincidental discovery. One day when I was playing with the Telegram Bot API, I found that there is an entry point to get an invite link for a group. This entry point works similarly to the one in the group property panel of a client: when a new link is generated, the older one is revoked. But this link is independent of the one generated with a Telegram client, so the 2 links can both work at the same time.
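For reference, the Bot API method that behaves this way appears to be `exportChatInviteLink`. Below is a minimal sketch of calling it using only the Python standard library. The helper names (`http_post`, `export_invite_link`) are my own for illustration, and the bot is assumed to be an admin of the group with the right to invite users.

```python
import json
import urllib.parse
import urllib.request
from typing import Callable

API_ROOT = "https://api.telegram.org"


def http_post(url: str, params: dict) -> dict:
    """Send a form-encoded POST request and decode the JSON reply."""
    data = urllib.parse.urlencode(params).encode()
    with urllib.request.urlopen(url, data=data, timeout=30) as resp:
        return json.load(resp)


def export_invite_link(token: str, chat_id,
                       post: Callable[[str, dict], dict] = http_post) -> str:
    """Ask the Bot API for a fresh invite link; the bot's previous link is revoked."""
    reply = post(f"{API_ROOT}/bot{token}/exportChatInviteLink",
                 {"chat_id": chat_id})
    if not reply.get("ok"):
        raise RuntimeError(reply.get("description", "Bot API call failed"))
    return reply["result"]
```

The `post` parameter exists only so the function can be exercised without touching the network. In normal use it would be called as `export_invite_link(token, group_id)`.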
Inspired by this feature, I have updated the structure of my EH Forwarder Bot support group to incorporate it. As it is a software support group, there is some information a newcomer should know (read the docs, existing bugs, FAQs, how to ask questions, etc.) for more efficient communication. I have put all of these summaries in a public channel in bottom-to-top order (i.e. the first message to read is at the bottom). Each piece of information is in a separate message. I then put the links to the actual groups in the last (top-most) message.
Using the feature mentioned above, I wrote a Python script that grabs a new invitation link every day at 0:00 and updates the message with it. Note that the message must be initially sent by the bot, or the bot would not be able to edit it after 48 hours.
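A rough sketch of what such a script could look like is below. This is not my actual script: the message text in `render`, the function names, and the sleep-until-midnight loop are all assumptions made for illustration. It uses the Bot API methods `exportChatInviteLink` and `editMessageText`, and expects the ID of a message that the bot itself posted in the channel.

```python
import datetime as dt
import json
import time
import urllib.parse
import urllib.request

API_ROOT = "https://api.telegram.org"


def call(token, method, params):
    """Call a Bot API method and return its "result" field."""
    data = urllib.parse.urlencode(params).encode()
    url = f"{API_ROOT}/bot{token}/{method}"
    with urllib.request.urlopen(url, data=data, timeout=30) as resp:
        reply = json.load(resp)
    if not reply.get("ok"):
        raise RuntimeError(reply.get("description", "Bot API call failed"))
    return reply["result"]


def seconds_until_midnight(now: dt.datetime) -> float:
    """Seconds from `now` until the next 0:00 on the same clock."""
    tomorrow = (now + dt.timedelta(days=1)).replace(
        hour=0, minute=0, second=0, microsecond=0)
    return (tomorrow - now).total_seconds()


def render(link: str) -> str:
    """Hypothetical text of the top-most channel message."""
    return f"Join the support group: {link}"


def refresh(token, group_id, channel_id, message_id, api=call):
    """Rotate the group's invite link and write it into the channel message."""
    link = api(token, "exportChatInviteLink", {"chat_id": group_id})
    api(token, "editMessageText", {
        "chat_id": channel_id,
        "message_id": message_id,
        "text": render(link),
    })
    return link


def main(token, group_id, channel_id, message_id):
    """Sleep until local midnight, refresh the link, and repeat forever."""
    while True:
        time.sleep(seconds_until_midnight(dt.datetime.now()))
        refresh(token, group_id, channel_id, message_id)
```

A cron job that runs `refresh` once a day would work just as well as the loop in `main`. The loop is only there to keep the sketch in one file.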
How well it works
In a way, other than being invited by existing members, the only way to get into the group is to read through that README channel and click on the link. This has slightly raised the bar for joining the group, but it has also effectively minimised the number of people going straight into the group and asking stupid questions without reading the docs.
Spam-wise, most spammers crawl invitation links from the internet, Telegram indexes, etc., and store them in a link list for later use. With dynamic invitation links, the crawled links would probably no longer be valid by the time they are used. The only seemingly feasible way left is to spam manually. In the few years since this strategy was put into action, only one spammer has managed to get in.
The drawback of this method is that it could make it somewhat harder for people to join the group. It might not be that suitable if the group is trying to welcome as many people as possible.