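Comment-Based Customer Acquisition: Extraction Crawler and Lead-Capture Source Code | Short-Video Comment Lead-Capture Monitoring Software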

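Preface:

Short-video platforms have become an important channel for information and entertainment. Users can comment on videos to share their views and reactions, and these comments carry useful signals as well as opinions. A comment lead-capture monitoring tool helps collect and analyze this comment data quickly and accurately, giving a clearer picture of user needs and market trends.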

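For this reason we wrote our own comment-analysis software, the latest cloud-computing edition. Why package it as software? Because that makes the approach easier and more intuitive for beginners to understand.

What follows covers part of the development approach and a look at the software interface: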

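  1. The automated actions needed for data collection

Why is automation needed? The comment section of a short video is not loaded until the user interacts with the page: when viewing a video, the comments load only after the comment icon button is clicked, and further comments load only as the list is scrolled down.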

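Here is the approach to the UI automation:

First: open a video and, once the page has loaded, click the video's comment button (it must be clicked only once, so the logic needs a guard for this).

Second: scroll down the comment list layer.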
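
The two steps above can be sketched as a small helper that clicks the comment button only once and then scrolls the list a fixed number of times. This is a minimal sketch under assumptions, not the software's actual implementation: `CommentLoader`, `runScript`, `scrollCount`, and `pause` are hypothetical names introduced for illustration, `runScript` stands in for whatever executes JavaScript in the embedded browser, and the two selectors are the ones used in this article's own snippets, which may stop matching when the site changes its markup.

```csharp
using System;
using System.Threading.Tasks;

// Hypothetical helper: clicks the comment button once, then scrolls the list.
public sealed class CommentLoader
{
    // JavaScript that clicks the comment button if it is present.
    private const string ClickScript = @"
        var divElement = document.querySelector('div.kT7icnwc');
        if (divElement) { divElement.click(); }";

    // JavaScript that scrolls the comment list to its bottom.
    private const string ScrollScript = @"
        var commentList = document.querySelector('div[data-e2e=""comment-list""]');
        if (commentList) { commentList.scrollTop = commentList.scrollHeight; }";

    private readonly Func<string, Task> runScript; // assumed script executor
    private bool clicked;                          // guards the one-time click

    public CommentLoader(Func<string, Task> runScript)
    {
        if (runScript == null) throw new ArgumentNullException("runScript");
        this.runScript = runScript;
    }

    // Clicks the comment button on the first call only, then scrolls
    // the comment list scrollCount times, pausing after each scroll.
    public async Task LoadAsync(int scrollCount, TimeSpan pause)
    {
        if (!clicked)
        {
            await runScript(ClickScript);
            clicked = true;
        }
        for (int i = 0; i < scrollCount; i++)
        {
            await runScript(ScrollScript);
            await Task.Delay(pause); // give the page time to load more comments
        }
    }
}
```

In the application, `runScript` could be a lambda that forwards the script to `chromeBrowser2.ExecuteScriptAsync` and returns a completed task. Keeping the executor as a delegate also lets the click-once logic be exercised without a browser.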

The click code and the scroll code are shown in the examples below.

  Click layer

  // Click the comment button if it is present on the page.
  chromeBrowser2.ExecuteScriptAsync(@"
      var divElement = document.querySelector('div.kT7icnwc');
      if (divElement) {
          divElement.click();
      }
  ");

The following code scrolls down the comment layer on the douyin video layer:

  // Scroll the comment list to the bottom so that more comments load.
  chromeBrowser2.ExecuteScriptAsync(@"
      var commentList = document.querySelector('div[data-e2e=""comment-list""]');
      if (commentList) {
          commentList.scrollTop = commentList.scrollHeight;
      }
  ");

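We built the software as three modules that together cover monitoring and analysis, as shown below:

Scraping a page requires working out which tags hold the data. Shown here is part of the code that parses the author tag and the video's basic information; it uses regular expressions plus string functions.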

            // Fragment: parses the author name out of the page HTML held in `html`.
            string htmlContent = html;
            try
            {
                // Match the span that holds the author name.
                Regex regex = new Regex(@"<span class=""j5WZzJdp y7epAOXf hVNC9qgC"">(.*?)</span>", RegexOptions.IgnoreCase);
                Match match = regex.Match(htmlContent);
                if (match.Success)
                {
                    // First capture group: the content between the opening <span ...> and </span>.
                    zuozhe_name = match.Groups[1].Value;
                    // Strip leftover nested-span markup.
                    zuozhe_name = zuozhe_name.Replace("<span>", "");
                    zuozhe_name = zuozhe_name.Replace("/", "");
                }
            }
            catch
            {
                // Parsing errors are swallowed; zuozhe_name keeps its previous value.
            }

        private void zuozhe_url_ceng(string html)
        {
            // Capture the first href that follows the author container div.
            string pattern = "<div class=\"uUjpLYc2 k13DwHsB O1xRgMXN\">.*?href=\"([^\"]+)\"";
            Regex regex = new Regex(pattern);
            Match match = regex.Match(html);
            if (match.Success)
            {
                zuozhe_url = match.Groups[1].Value;
            }
        }

        private void shipin_dates_ceng(string html)
        {
            try
            {
                // Extract the publish-time text from <span class="time">...</span>.
                Regex regex = new Regex(@"<span class=""time"">(.*?)</span>", RegexOptions.IgnoreCase);
                Match match = regex.Match(html);
                if (!match.Success)
                {
                    return;
                }

                // Raw text with leftover markup and separators stripped.
                string raw = match.Groups[1].Value.Trim();
                raw = raw.Replace("<span>", "").Replace("/", "").Replace("·", "").Trim();

                // Normalize absolute dates such as "2023年5月27日" to "2023-5-27".
                // Relative times are parsed from `raw` further down, because this
                // replacement would otherwise destroy the "月" and "年" units.
                string normalized = raw.Replace("日", "").Replace("年", "-").Replace("月", "-").Trim();

                Regex yearRegex = new Regex(@"\b\d{4}\b");
                Regex dateRegex = new Regex(@"\b\d{1,2}-\d{1,2}\b");

                if (yearRegex.IsMatch(normalized))
                {
                    // The text already contains a four-digit year.
                    shipin_dates = normalized;
                    return;
                }
                if (dateRegex.IsMatch(normalized))
                {
                    // Month and day only: assume the video was published in the current year.
                    shipin_dates = DateTime.Now.Year + "-" + normalized;
                    return;
                }

                // Relative times such as "3天前" or "2小时前": the number precedes the unit.
                string[] units = { "分钟", "小时", "天", "周", "月", "年" };
                DateTime now = DateTime.Now;
                DateTime? published = null;
                foreach (string unit in units)
                {
                    int index = raw.IndexOf(unit);
                    if (index == -1)
                    {
                        continue;
                    }
                    int amount = Convert.ToInt32(raw.Substring(0, index).Trim());
                    switch (unit)
                    {
                        case "分钟": published = now.AddMinutes(-amount); break;
                        case "小时": published = now.AddHours(-amount); break;
                        case "天": published = now.Date.AddDays(-amount); break;
                        case "周": published = now.Date.AddDays(-amount * 7); break;
                        case "月": published = now.Date.AddMonths(-amount); break;
                        case "年": published = now.Date.AddYears(-amount); break;
                    }
                    break;
                }

                if (published == null)
                {
                    Console.WriteLine("The input contains neither a year nor a recognizable date format");
                    return;
                }

                // Store the date in a fixed, culture-independent format.
                shipin_dates = published.Value.ToString("yyyy-MM-dd");

                // Check whether the video was published within the last two years.
                // The 730-day threshold is meant to be read from a field.
                TimeSpan interval = now - published.Value;
                if (Math.Abs(interval.TotalDays) <= 730)
                {
                    Console.WriteLine("The video time is within two years of the current time");
                }
                else
                {
                    Console.WriteLine("The video time is not within two years of the current time");
                }
            }
            catch
            {
                // Parsing errors are swallowed; shipin_dates keeps its previous value.
            }
        }
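
The date handling above reads the clock and writes to a field, which makes it hard to check in isolation. As an illustration, the same idea can be written as a pure function that takes the text and the current time and returns a date. This is a minimal sketch under assumptions: `PublishDateParser` and `Parse` are hypothetical names, and the accepted formats ("N分钟前", "N天前", "M月D日", "YYYY年M月D日" and so on) are inferred from the string replacements in the method above rather than taken from any documented page format.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper: converts a publish-time string into a DateTime.
public static class PublishDateParser
{
    // Absolute date such as "2023年5月27日" or "5月27日" (year optional).
    private static readonly Regex Absolute =
        new Regex(@"(?:([0-9]{4})年)?([0-9]{1,2})月([0-9]{1,2})日");

    // Relative time such as "3天前": a number, a unit, then "前".
    private static readonly Regex Relative =
        new Regex(@"([0-9]+)\s*(分钟|小时|天|周|月|年)前");

    // Returns the publish time, or null when the text is not recognized.
    // An impossible date such as "13月40日" makes the DateTime constructor throw.
    public static DateTime? Parse(string text, DateTime now)
    {
        Match abs = Absolute.Match(text);
        if (abs.Success)
        {
            // A missing year is assumed to be the current year.
            int year = abs.Groups[1].Success ? int.Parse(abs.Groups[1].Value) : now.Year;
            int month = int.Parse(abs.Groups[2].Value);
            int day = int.Parse(abs.Groups[3].Value);
            return new DateTime(year, month, day);
        }

        Match rel = Relative.Match(text);
        if (!rel.Success)
        {
            return null;
        }

        int amount = int.Parse(rel.Groups[1].Value);
        switch (rel.Groups[2].Value)
        {
            case "分钟": return now.AddMinutes(-amount);
            case "小时": return now.AddHours(-amount);
            case "天": return now.AddDays(-amount);
            case "周": return now.AddDays(-amount * 7);
            case "月": return now.AddMonths(-amount);
            default: return now.AddYears(-amount); // "年"
        }
    }
}
```

Passing `now` as a parameter means the two-year check can then be a plain comparison such as `(now - parsed.Value).TotalDays <= 730`, with no round trip through a formatted string.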

        private void title_ceng(string html)
        {
            try
            {
                // Match the span that holds the video title.
                Regex regex = new Regex(@"<span class=""j5WZzJdp IoRNNcMW hVNC9qgC"">(.*?)</span>", RegexOptions.IgnoreCase);
                Match match = regex.Match(html);
                if (match.Success)
                {
                    // First capture group: the content between the opening <span ...> and </span>.
                    title = match.Groups[1].Value;
                    // Strip leftover nested-span markup.
                    title = title.Replace("<span>", "");
                    title = title.Replace("/", "");
                }
            }
            catch
            {
                // Parsing errors are swallowed; title keeps its previous value.
            }
        }
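
The parsing methods above all follow the same steps: match one pattern, take the first capture group, and clean up leftover markup. One possible way to factor that out is a single helper, sketched below. `HtmlFieldExtractor` and `ExtractFirst` are hypothetical names, and the cleanup differs from the original: instead of removing the literal strings `<span>` and `/`, this sketch strips any remaining tags with a second regular expression and decodes HTML entities. Regex-based extraction stays fragile either way, since it depends on the exact markup of the page.

```csharp
using System;
using System.Net;
using System.Text.RegularExpressions;

// Hypothetical helper shared by the field-parsing methods.
public static class HtmlFieldExtractor
{
    // Returns the first capture group of `pattern` in `html`, with nested
    // tags removed and HTML entities decoded, or null when nothing matches.
    public static string ExtractFirst(string html, string pattern)
    {
        // Singleline lets "." match newlines, so a tag may span several lines.
        Match match = Regex.Match(html, pattern,
            RegexOptions.IgnoreCase | RegexOptions.Singleline);
        if (!match.Success)
        {
            return null;
        }

        // Remove any tags left inside the captured text.
        string inner = Regex.Replace(match.Groups[1].Value, "<[^>]+>", "");
        return WebUtility.HtmlDecode(inner).Trim();
    }
}
```

With such a helper, the title lookup would reduce to one call that passes the page HTML and the same pattern used in `title_ceng`, leaving the caller to decide what to do when the result is null.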

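Capturing leads from short-video comment sections is a very practical way to help operators analyze how their accounts are performing. Below are screenshots of part of the source code behind this approach; in the next article I will walk through application examples for these screenshots in detail.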
