Question
Table: Tweets
+-------------+---------+
| Column Name | Type    |
+-------------+---------+
| tweet_id    | int     |
| content     | varchar |
+-------------+---------+
tweet_id is the primary key (column with unique values) for this table.
This table contains all the tweets in a social media app.
Write a solution to find the IDs of the invalid tweets. The tweet is invalid if the number of characters used in the content of the tweet is strictly greater than 15.
Return the result table in any order.
The result format is in the following example.
MySQL Solutions
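-- Note: LENGTH() returns the length in bytes, so it equals the character
-- count only when every character in content is a single byte.
SELECT
    tweet_id
FROM
    Tweets
WHERE
    LENGTH(content) > 15;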

-- CHAR_LENGTH() returns the length in characters, which is what the problem asks for.
SELECT
    tweet_id
FROM
    Tweets
WHERE
    CHAR_LENGTH(content) > 15;
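The two queries above can disagree when the content contains multi-byte characters. The sketch below imitates both functions in plain Python to show the difference. It assumes the column is stored in a UTF-8 character set such as utf8mb4; `byte_length` and `char_length` are hypothetical helper names used only for illustration, not MySQL or pandas functions.

```python
# Illustrative stand-ins for MySQL's LENGTH() and CHAR_LENGTH(),
# assuming the text is encoded as UTF-8 (an assumption, e.g. utf8mb4).

def byte_length(text: str) -> int:
    """Roughly what LENGTH() reports: the number of bytes."""
    return len(text.encode("utf-8"))


def char_length(text: str) -> int:
    """Roughly what CHAR_LENGTH() reports: the number of characters."""
    return len(text)


# "café au lait" followed by two U+2615 (hot beverage) symbols.
tweet = "caf\u00e9 au lait \u2615\u2615"

print(char_length(tweet))  # 15 characters, so the tweet is valid
print(byte_length(tweet))  # 20 bytes, so LENGTH(content) > 15 would flag it
```

With pure ASCII content the two counts are identical, which is why both queries return the same rows on such data.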
Pandas Solution
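- Series.apply(lambda x: <condition on x>)
  - calls the function on each element; when the function returns True/False, the result is a boolean Series that can be used as a row mask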
import pandas as pd

def invalid_tweets(tweets: pd.DataFrame) -> pd.DataFrame:
    # Keep only the rows whose content is longer than 15 characters
    temp = tweets[tweets['content'].apply(lambda x: len(x) > 15)]
    return temp[['tweet_id']]
- Series.str.len()
  - computes the length of each element in the Series/Index
  - each element can be a sequence (e.g., string, tuple, or list) or a collection (e.g., dictionary)
def invalid_tweets(tweets: pd.DataFrame) -> pd.DataFrame:
    temp = tweets[tweets['content'].str.len() > 15]
    return temp[['tweet_id']]
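As a quick check, both pandas approaches can be run on a small frame. This is a minimal sketch: the sample rows are made up for illustration and are not the problem's official example, and the names `invalid_tweets_apply` and `invalid_tweets_str` are hypothetical, chosen only to tell the two variants apart.

```python
import pandas as pd


def invalid_tweets_apply(tweets: pd.DataFrame) -> pd.DataFrame:
    # Variant 1: element-wise Python len() through Series.apply
    mask = tweets["content"].apply(lambda x: len(x) > 15)
    return tweets.loc[mask, ["tweet_id"]]


def invalid_tweets_str(tweets: pd.DataFrame) -> pd.DataFrame:
    # Variant 2: vectorized string accessor
    mask = tweets["content"].str.len() > 15
    return tweets.loc[mask, ["tweet_id"]]


# Made-up sample data: lengths are 11, 15, and 28 characters.
sample = pd.DataFrame(
    {
        "tweet_id": [1, 2, 3],
        "content": [
            "short tweet",
            "exactly 15 char",
            "this one is clearly too long",
        ],
    }
)

result_apply = invalid_tweets_apply(sample)
result_str = invalid_tweets_str(sample)

print(result_str["tweet_id"].tolist())  # [3]
```

The 15-character tweet is not returned because the condition is strictly greater than 15. Both variants select the same rows here.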