Scraping website resources with Java + Selenium and downloading them locally

Latest recommended article published on 2023-09-07 12:20:13

weixin_45692892

Article tags: java, python, web scraping

Article link: https://blog.csdn.net/weixin_45692892/article/details/117369803


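I have been learning web scraping in Java recently and wrote a small demo that scrapes songs from a website. I am recording it here.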

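import java.io.File;
import java.io.FileOutputStream;
import java.io.InputStream;
import java.net.URL;
import java.util.List;
import org.apache.commons.io.IOUtils;  // Apache Commons IO
import org.openqa.selenium.By;
import org.openqa.selenium.WebDriver;
import org.openqa.selenium.WebElement;
import org.openqa.selenium.chrome.ChromeDriver;
import org.openqa.selenium.chrome.ChromeOptions;

public class MusicCrawler {  // the class name is arbitrary
    public static void main(String[] args) throws InterruptedException {
        // Point Selenium at the ChromeDriver binary
        System.setProperty("webdriver.chrome.driver", "src/main/resources/chromeDriverPage/chromedriver.exe");
        ChromeOptions chromeOptions = new ChromeOptions();
        chromeOptions.addArguments("--headless");  // hide the browser window; optional, but we only download here
        WebDriver webDriver = new ChromeDriver(chromeOptions);
        try {
            webDriver.get("https://334.kim/");  // open the site
            Thread.sleep(1000);  // wait 1 second for the page to load
            List<WebElement> elements = webDriver.findElements(By.xpath("//a"));  // collect all <a> elements
            for (WebElement element : elements) {
                String hrefsrc = element.getAttribute("hrefsrc");  // the site keeps the audio URL in "hrefsrc"
                if (hrefsrc != null) {
                    try {
                        // Take the file name from the part after the first space in the attribute value
                        File file = new File("src/main/resources/music/" + hrefsrc.split(" ")[1]);
                        file.getParentFile().mkdirs();  // create the target directory, not a directory named like the file
                        // Open the audio input stream and write it to disk; both streams are closed automatically
                        try (InputStream inputStream = new URL(hrefsrc).openStream();
                             FileOutputStream fileOutputStream = new FileOutputStream(file)) {
                            fileOutputStream.write(IOUtils.toByteArray(inputStream));
                        }
                    } catch (Exception e) {
                        System.out.println("Download failed: " + hrefsrc);
                    }
                }
            }
        } finally {
            webDriver.quit();  // always shut down the browser and the driver process
        }
    }
}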
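
The demo takes the file name from `hrefsrc.split(" ")[1]`, which depends on how this particular site formats the attribute. As a rough sketch of a more general alternative (not what the demo above does), the file name could be taken from the last path segment of the URL and the body streamed straight to disk with `Files.copy` instead of being buffered in memory. The class `DownloadHelper`, its methods `fileNameFromUrl` and `download`, and the fallback name `download.bin` are all hypothetical names chosen for illustration. The sketch also assumes the URL is well formed: `URI.create` rejects strings that contain raw spaces.

```java
import java.io.IOException;
import java.io.InputStream;
import java.net.URI;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

// Hypothetical helper, not part of the original demo.
final class DownloadHelper {

    /** Returns the decoded last path segment of the URL, or the fallback if there is none. */
    public static String fileNameFromUrl(String url, String fallback) {
        String path = URI.create(url).getPath();  // decoded path; the query string is excluded
        if (path == null) {
            return fallback;
        }
        String name = path.substring(path.lastIndexOf('/') + 1);
        // Reject empty names and names that would point outside the target directory
        if (name.isEmpty() || name.equals(".") || name.equals("..")) {
            return fallback;
        }
        return name;
    }

    /** Streams the resource at the URL into targetDir and returns the written file. */
    public static Path download(String url, Path targetDir) throws IOException {
        Files.createDirectories(targetDir);  // creates the directory tree if it is missing
        Path target = targetDir.resolve(fileNameFromUrl(url, "download.bin"));
        try (InputStream in = URI.create(url).toURL().openStream()) {
            // Copies in chunks, so large files are never held fully in memory
            Files.copy(in, target, StandardCopyOption.REPLACE_EXISTING);
        }
        return target;
    }
}
```

Inside the loop of the demo, a call such as `DownloadHelper.download(hrefsrc, Paths.get("src/main/resources/music"))` (with `java.nio.file.Paths` imported) could then replace the manual stream handling, provided the attribute value is a plain URL.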
