Executive Summary (aka TL;DR)

  • Most of the illegal health products reported in Singapore are marketed for sexual enhancement and weight loss.

  • The majority of the illegal health products are labelled in Chinese.

  • The most common dosage form design is a green oblong capsule.

  • The top 5 most common adulterants are Sildenafil, Sibutramine, Tadalafil, Phenolphthalein, and Lignocaine.


Background and Motivation

The lucrative nature of peddling counterfeit and adulterated drugs has led to the proliferation of the global trade in illegal health products. This is also driven by growing consumer demand for these products and their low purchase cost. The obvious downside is that consuming health products purchased from dubious sources runs the risk of causing serious health problems.

These products often contain undeclared ingredients that enhance their effects and entice consumers to take them. These ingredients may be hazardous (especially if taken without medical supervision), over- or under-dosed, banned, or not yet assessed for safe use in humans.

In a bid to better understand the illegal health product trade in Singapore, I decided to conduct an analysis of the publicly released list of detected and tested illegal health products reported by the Health Sciences Authority (HSA).


This analysis was sparked by a conversation with a fellow Entrepreneur First cohort member (Shrey Chaturvedi), when we were discussing potential problems to tackle in the Southeast Asian counterfeit drug market. I decided it was a good opportunity to work on a mini-project related to what I enjoy: the intersection of drugs and data.

Part I: Sourcing for Data

The database for illegal health products reported by the HSA is available for public viewing on this page. Although this list is not exhaustive, it still serves as a reasonable sample of all the illegal health products being traded in Singapore.

This dataset is presented as a data table on the site, and was accessed and retrieved on 15th August 2020.


Screenshot of HSA Illegal Health Products Database Search Results

The features available are as follows:


Table of dataset features (columns) and respective description

The contents of the database on the site were manually transcribed and saved as an Excel (.xlsx) file with the use of Notepad++ and Microsoft Excel. The file was then imported into Jupyter Notebook for further analysis using Python 3.

Having come across street-side counterfeit drug peddlers while overseas on holiday, I initially hypothesized that these counterfeit drugs are mainly lifestyle products. In particular, the ‘best-seller’ is likely to be sex-enhancement drugs targeted at the adult male population.

Part II: Data Cleaning

There are a total of 245 reported illegal health products documented in the HSA Illegal Health Products database at the time of this writing.


All 54 rows with null values in the ‘Dosage Form’ column were associated with tablets as displayed in the respective ‘Product Description’ column. These null values were thus filled with ‘Tablet’ as a dosage form entry, thereby creating a new Dosage Form category of ‘Tablet’ not utilized previously. The count of null values for each column in the final dataset is shown below:

Frequency count of null values for each column
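The fill-and-count step described above can be sketched in plain Python. This is a minimal illustration, not the code used in the analysis: the rows are invented, the column name ‘Dosage Form Color’ is an assumption, and the helper names `fill_tablets` and `count_nulls` are hypothetical.

```python
from collections import Counter

# Hypothetical rows standing in for the transcribed spreadsheet; None marks a blank cell.
# 'Dosage Form Color' is an assumed column name.
products = [
    {"Product Description": "Alpha Slimming Tablet", "Dosage Form": None, "Dosage Form Color": "White"},
    {"Product Description": "Beta Capsule", "Dosage Form": "Capsule", "Dosage Form Color": "Green"},
    {"Product Description": "Gamma Liquid", "Dosage Form": "Liquid", "Dosage Form Color": None},
]


def fill_tablets(rows):
    """Fill blank 'Dosage Form' cells with 'Tablet' when the description mentions a tablet."""
    filled = []
    for row in rows:
        row = dict(row)  # copy so the input rows are left untouched
        if row["Dosage Form"] is None and "tablet" in row["Product Description"].lower():
            row["Dosage Form"] = "Tablet"
        filled.append(row)
    return filled


def count_nulls(rows):
    """Count blank (None) cells per column."""
    counts = Counter()
    for row in rows:
        for column, cell in row.items():
            counts[column] += int(cell is None)
    return dict(counts)


cleaned = fill_tablets(products)
null_counts = count_nulls(cleaned)
print(null_counts)
```

Checking the description before filling mirrors the manual verification described in the text; a blank cell whose description does not mention a tablet is left blank rather than guessed.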

The null values for shape and color were mainly due to products packaged as vials and bottles, which are harder to characterize as compared to the usual oral tablets or capsules.


Given that there was plenty of missing data in the ‘Remarks for Dosage Form Marking’ column (i.e. 61.2% NA), the review of product markings was omitted in this analysis.

There were 13 products with the dosage form stated as ‘Pill’, which is an uninformative categorization. The product images were reviewed manually to determine each product’s exact dosage form (i.e. tablet or capsule).

Part III: Insights from Exploratory Data Analysis

Now to the main part of this analysis, which is the patterns and findings observed.


(i) Product Description

The product description gives the name of the product, along with some useful (albeit brief) information about its intended use for the consumer. Given that we are dealing with text strings here, it would be a good idea to use a word cloud generator (with Python WordCloud library) to illustrate the text frequency in these descriptions.

After removing uninformative stop words (e.g. ‘Brand’, ‘Capsule’, ‘Jiao Nang’ (which means capsule in Chinese), and ‘Tablet’), the following word cloud was produced:

Full word cloud for Product Description
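A word cloud is a picture of word frequencies, so the stop-word step can be illustrated by counting words directly. The sketch below uses only the standard library; the sample descriptions are invented and `word_frequencies` is a hypothetical helper, not the code behind the figure.

```python
import re
from collections import Counter

# Stop words named in the text; 'Jiao Nang' is split into its two syllables here.
STOP_WORDS = {"brand", "capsule", "tablet", "jiao", "nang"}


def word_frequencies(descriptions, stop_words=STOP_WORDS):
    """Count Latin-alphabet words across descriptions, skipping stop words."""
    counts = Counter()
    for text in descriptions:
        words = re.findall(r"[a-z]+", text.lower())
        counts.update(word for word in words if word not in stop_words)
    return counts


# Hypothetical descriptions, for illustration only.
descriptions = [
    "Alpha Brand Slimming Capsule",
    "Beta Jian Fei Jiao Nang",
    "Gamma Slimming Tablet",
]
frequencies = word_frequencies(descriptions)
print(frequencies.most_common(3))
```

If the wordcloud package is installed, a frequency mapping like this one can be passed to `WordCloud().generate_from_frequencies(frequencies)` to draw the cloud.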

It is clear that most of these words are romanizations of Chinese characters (i.e. Hanyu Pinyin). This suggests that many products are labelled mainly in Chinese, and that interpreting this full word cloud is not the best way to understand the nature of these products.

Given that the word cloud suggests a high occurrence of Chinese characters, it would be interesting to see the proportion of the illegal health products containing Chinese characters in their product description.


Proportion of illegal health products containing Chinese characters in product description
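One way to compute such a proportion is to flag each description that contains at least one Chinese character. The following is a minimal sketch with invented descriptions; the Unicode range is an assumption about what counts as a Chinese character, and `has_chinese` is a hypothetical helper.

```python
import re

# CJK Unified Ideographs block; covers common Chinese characters,
# but not every extension block (an assumption of this sketch).
CHINESE_CHAR = re.compile(r"[\u4e00-\u9fff]")


def has_chinese(text):
    """Return True if the text contains at least one Chinese character."""
    return CHINESE_CHAR.search(text) is not None


# Hypothetical descriptions, for illustration only.
descriptions = [
    "America Warrior 美国勇士",
    "Alpha Slimming Capsule",
    "Jian Fei Jiao Nang 减肥胶囊",
    "Kapsul Asam Urat",
]
flags = [has_chinese(d) for d in descriptions]
share = sum(flags) / len(flags)
print(f"{share:.1%} of descriptions contain Chinese characters")
```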

It can be seen that the list of illegal drugs is dominated by products labeled in Chinese (73.5%). This might imply that either the Chinese population is the primary target group for illegal drug dealers, or that these products are sourced primarily from China.

The next step was to generate a Chinese character word cloud. This was done by first using a regular expression (regex) to extract all Chinese characters from the ‘Product Description’ column, and then using Python’s jieba library to tokenize them. With certain Chinese stop words removed (e.g. 胶囊), the WordCloud library was used once again to generate a word cloud:

Word cloud for products containing Chinese characters in Product Description
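The extract, tokenize and filter pipeline can be sketched as follows. To keep the example free of third-party dependencies, a toy dictionary tokenizer stands in for jieba; `toy_tokenize`, `TOY_VOCAB` and the sample descriptions are all hypothetical, and real segmentation would be far more capable.

```python
import re
from collections import Counter

CHINESE_RUN = re.compile(r"[\u4e00-\u9fff]+")  # runs of consecutive Chinese characters
STOP_WORDS = {"胶囊"}  # 'capsule', the stop word named in the text
TOY_VOCAB = ("美国", "伟哥", "胶囊", "减肥")  # hypothetical mini-dictionary


def toy_tokenize(run):
    """Greedy dictionary-match segmentation; a stand-in for a real tokenizer."""
    tokens, i = [], 0
    while i < len(run):
        # Fall back to a single character when no vocabulary word matches.
        match = next((w for w in TOY_VOCAB if run.startswith(w, i)), run[i])
        tokens.append(match)
        i += len(match)
    return tokens


def chinese_word_counts(descriptions, tokenize=toy_tokenize):
    """Extract Chinese runs, tokenize them, and count the non-stop-word tokens."""
    counts = Counter()
    for text in descriptions:
        for run in CHINESE_RUN.findall(text):
            counts.update(t for t in tokenize(run) if t not in STOP_WORDS)
    return counts


# Hypothetical descriptions, for illustration only.
descriptions = ["America Viagra 美国伟哥胶囊", "Jian Fei Jiao Nang 减肥胶囊", "USA Brand 美国减肥"]
counts = chinese_word_counts(descriptions)
print(counts.most_common())
```

With jieba installed, `jieba.lcut` can be passed as the `tokenize` argument in place of the toy tokenizer.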

With this Chinese word cloud, it becomes easier to understand the nature of the illegal health products.


  • Finding 1


    The word America (‘美国’) occurs frequently, implying that dealers tend to market their counterfeit drugs as products of the USA, likely to falsely boost their appeal, perceived quality and legitimacy among consumers. This was confirmed when I looked at the specific products containing ‘美国’ and found them branded with phrases like ‘America Warrior’ and ‘America Viagra’. Another country observed in the word cloud is Germany (‘德国’), suggesting that drugs made in Germany are also perceived as carrying strong branding.

  • Finding 2


    The two most common claims made by these health products relate to sexual performance and male genitalia (‘牛鞭’, ‘升阳’, ‘延时’, ‘魔根’), and weight loss (‘减肥’, ‘瘦身’). This is further supported by other frequently occurring nuanced words associated with sexual vitality (‘金聖力’, ‘战神’, ‘动力’, ‘天雄’, ‘神威’).

  • Finding 3


    There are several common words associated with dragon (‘龙牌’, ‘天龙’), which is not out of the ordinary since the dragon traditionally symbolizes potent and auspicious powers in Chinese culture.


  • Finding 4


    While drug names are not commonly mentioned in the Chinese descriptions, there is one obvious branded name that appeared frequently, which is Viagra (‘威哥’, ‘伟哥’).


It is also important to analyse the other products that did not contain any Chinese characters, which make up 26.5% of this illegal drugs dataset. After removing English and Malay stop words (e.g. Kapsul, Obat), the generated word cloud is shown below:

Word cloud for products without Chinese characters in Product Description

Similar to what was seen earlier for the Chinese-labeled products, these products also appear to be marketed for weight loss and sexual enhancement for men. This is observed from the English words (e.g. ‘Men’, ‘Sexual’, ‘Slim’, ‘Weight’), as well as some of the suggestive Malay words (e.g. ‘Kuat’ (strong), ‘Untuk Pria’ (for men), ‘Tongkat Ali’ (a herb known for its use in managing erectile dysfunction)). All these insights support the earlier hypothesis that sexual enhancement drugs are the leading counterfeit products being traded.

  • Finding 5


    Another finding from the word cloud exploration is the presence of the Malay phrase ‘Asam Urat’, which means uric acid in English. High levels of uric acid in the blood are a cause of gout, and the frequent occurrence of this phrase suggests that many counterfeit products are marketed for gout management as well.

(ii) Dosage Form

The majority of these illegal health products are in oral solid dosage forms, namely capsules and tablets.


Distribution of the Dosage Forms

This is unsurprising since oral formulations offer many advantages such as:


  • Ease of manufacture, packaging and transport

  • Good chemical and physical stability

  • Relatively low cost of production

  • Simple and accurate dosing for consumers


(iii) Dosage Form Color and Shape

Since white is the most commonly occurring color for oral medications in general, it is also not surprising to see that white is the predominant color for these illegal health products as well. The next three most common colors happen to be the standard primary colors of red, blue and green.

Distribution of the Dosage Form Colors

In terms of the dosage form shape, the predominant shape is oblong.


Distribution of the Dosage Form Shape

By looking at the above charts separately, you might imagine that the most common design is a white oblong capsule. However, we should be looking at the combination of dosage form, color and shape for each product. Doing so shows that the most common dosage form design was in fact a green oblong capsule rather than a white one.

Top 10 most common Dosage Form designs
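The gap between the per-attribute winners and the most common combination is easy to reproduce on a small example. In the sketch below the rows are invented (form, color, shape) tuples, chosen so that the two answers differ; they are not the HSA counts.

```python
from collections import Counter

# Hypothetical (dosage form, color, shape) rows, for illustration only.
products = [
    ("Capsule", "Green", "Oblong"),
    ("Capsule", "Green", "Oblong"),
    ("Capsule", "Green", "Oblong"),
    ("Tablet", "White", "Round"),
    ("Tablet", "White", "Round"),
    ("Capsule", "White", "Oblong"),
    ("Capsule", "White", "Oblong"),
    ("Tablet", "White", "Oblong"),
]

# Most common value of each attribute taken on its own.
marginal_guess = tuple(
    Counter(column).most_common(1)[0][0] for column in zip(*products)
)

# Most common combination of all three attributes taken together.
top_design, top_count = Counter(products).most_common(1)[0]

print("Per-attribute winners:", marginal_guess)
print("Most common design:", top_design, "seen", top_count, "times")
```

Here white, oblong and capsule each win on their own, yet the white oblong capsule appears only twice while the green oblong capsule appears three times, which is the same effect seen in the charts.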

Here is an example of what a green oblong capsule looks like:


(iv) Dosage Form Markings

It can be seen that the majority of these illegal health products (60.6%) do not have any markings on them. This makes it all the more difficult to pinpoint the specific identity of these drugs, as well as to distinguish the counterfeit products from the real ones.

Distribution of products with and without markings/engravings

(v) Adulterants

Adulterants are the undeclared and unapproved potent medicinal ingredients in a health product, and they are the main reason why these counterfeit products are deemed illegal.


These adulterants can cause serious adverse health effects owing to accidental misuse, overuse, or interaction with other medications, underlying health conditions, or other ingredients within the supplement.


The following table shows the top 10 most common adulterants found in these illegal products.


Top 10 adulterants discovered in the illegal health products
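A ranking like this can be produced by splitting each product’s adulterant entry into individual names and counting them. The sketch below assumes the entries are comma- or semicolon-separated strings, which the source does not confirm; the sample cells and the helper `top_adulterants` are hypothetical.

```python
import re
from collections import Counter


def top_adulterants(adulterant_cells, n=10):
    """Rank adulterants by the number of products they appear in."""
    counts = Counter()
    for cell in adulterant_cells:
        # Assumed format: names separated by commas or semicolons.
        names = {name.strip().title() for name in re.split(r"[,;]", cell) if name.strip()}
        counts.update(names)  # a set, so each product counts an adulterant once
    return counts.most_common(n)


# Hypothetical adulterant entries, one per product.
cells = [
    "Sildenafil, Tadalafil",
    "Sibutramine; Phenolphthalein",
    "sildenafil",
    "Lignocaine",
    "Sildenafil, Sibutramine",
]
ranking = top_adulterants(cells, n=2)
print(ranking)
```

Normalizing case and whitespace before counting avoids treating ‘sildenafil’ and ‘Sildenafil’ as two different substances.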

Now is a good time to bring some pharmacological knowledge into the fray by describing the top 5 adulterants in the above table.


(1) Sildenafil

What is it? Sildenafil is by far the most common adulterant found in illegal health products. Sildenafil is the generic name of the active compound found in Viagra (brand name), and is used to treat erectile dysfunction (ED).

Fun Fact: Sildenafil was originally developed for the treatment of hypertension (high blood pressure) and angina pectoris (chest pain due to heart disease). However, during the clinical trials, researchers found the drug to be more effective at inducing erections than at treating angina.

Side Effects: While most side effects are generally mild (e.g. flushing, headaches), there is a risk of more severe reactions such as vision loss, priapism (persistent and painful erection), and severe hypotension. These risks are amplified when consuming counterfeit products, since the doses they contain are unregulated.

(2) Sibutramine

What is it? Sibutramine is a compound used for weight loss. This supports the insights gathered earlier that weight loss is a common area which illegal health products tend to target. Sibutramine works by increasing thermogenesis and making the user feel fuller after a meal, and it has mainly been studied in obese patient populations.

Side Effects: It is known to be associated with an increased risk of cardiovascular events like heart attack, stroke and hypertension.

(3) Tadalafil

What is it? Being in the same drug class as Sildenafil (i.e. Phosphodiesterase-5 Enzyme Inhibitors), Tadalafil is also used to treat ED, and shares many similar features with Sildenafil. Its brand name is Cialis.

(4) Phenolphthalein

What is it? Phenolphthalein is a laxative, and is commonly found in adulterated weight loss products. Multiple dietary supplements containing Phenolphthalein and Sibutramine have been previously recalled by the U.S. Food and Drug Administration (FDA) due to their unapproved inclusion in the supplements.

Side Effects: Phenolphthalein exposure has been associated with carcinogenicity (i.e. can cause cancer).

(5) Lignocaine

What is it? Lignocaine is a local anesthetic, meaning that it is a medication used to numb tissues in a specific body region when applied topically to the skin or mucous membranes. It typically comes in the form of topical products (e.g. sprays, gels), and is used to manage premature ejaculation due to its numbing effect.

Side Effects: Topical use has generally mild side effects such as itch/redness, but the potentially poor quality of the illegal product may lead to unexpected hypersensitivity/allergic skin reactions.


From here, I decided to take a closer look at these Lignocaine-containing products, to confirm my hypothesis that all supplements containing Lignocaine are of the topical dosage form.


Dosage Form distribution of Lignocaine-containing products
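This kind of check amounts to filtering the rows by adulterant and then counting dosage forms. Below is a minimal sketch with invented rows; the ‘Adulterants’ column name, the set of oral forms, and the helper `dosage_forms_containing` are assumptions made for illustration.

```python
from collections import Counter

# Hypothetical rows; 'Adulterants' is an assumed column name.
products = [
    {"Dosage Form": "Liquid for external use", "Adulterants": "Lignocaine"},
    {"Dosage Form": "Capsule", "Adulterants": "Sildenafil, Lignocaine"},
    {"Dosage Form": "Liquid for external use", "Adulterants": "Lignocaine"},
    {"Dosage Form": "Tablet", "Adulterants": "Sibutramine"},
]

ORAL_FORMS = {"Capsule", "Tablet"}  # forms treated as oral in this sketch


def dosage_forms_containing(rows, adulterant):
    """Count dosage forms among products whose adulterant entry mentions `adulterant`."""
    needle = adulterant.lower()
    return Counter(
        row["Dosage Form"] for row in rows if needle in row["Adulterants"].lower()
    )


forms = dosage_forms_containing(products, "Lignocaine")
oral = {form: n for form, n in forms.items() if form in ORAL_FORMS}
print(forms.most_common())
print("Oral products containing Lignocaine:", oral)
```

Any non-empty `oral` result contradicts the hypothesis that every Lignocaine-containing product is topical.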

Although the most common dosage form is indeed something used for topical application (liquid for external use), I was surprised to see that there were quite a number of oral capsule and tablet products containing Lignocaine as well, especially since Lignocaine is not meant for oral consumption. This is certainly something worth looking deeper into.

Next Steps

This analysis offers some insights into the trends and patterns observed in the illegal health product trade in Singapore, which can be of value for authorities like the HSA. For example, by understanding the common marketed claims and main groups of people targeted by illegal dealers, specially curated education campaigns can be conducted to raise awareness so that the public understands how to better protect themselves.

Having access to more data (such as place of purchase, country of manufacture, date of reporting, profiles of peddler/consumer etc.) would also help to make this analysis more insightful and useful in tackling the peddling of such illegal products.


Wrapping things up

In reality, it can be very challenging to distinguish real products from counterfeit ones. So here is some advice from your friendly pharmacist: Always purchase your medications and health products from registered medical institutions, clinics and pharmacies. If you do encounter such illegal products being peddled, you may report them to the HSA Enforcement Branch via phone (6866 3485) or email (hsa_is@hsa.gov.sg).

I look forward to hearing your feedback on the above analysis. If you would like to have a copy of the code, or wish to discuss more about this review, please feel free to drop me a message on LinkedIn!

Special thanks to Koo Ping Shung for sharing his feedback on the analysis.

