CS106L assignment 1 wiki


Last modified: 2024-01-20 11:01:39


First published: 2024-01-09 16:33:43

Article link: https://blog.csdn.net/2301_79140115/article/details/135476600


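Part A: Syntax overview

`1.`

`std::search` is a C++ function that searches a range delimited by two iterators for a subsequence. It is declared in the standard library's `<algorithm>` header.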

template< class ForwardIterator1, class ForwardIterator2 >
ForwardIterator1 search(ForwardIterator1 first1, ForwardIterator1 last1,
                        ForwardIterator2 first2, ForwardIterator2 last2);

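Main parameters

first1, last1: forward iterators delimiting the range to search in.
first2, last2: forward iterators delimiting the sequence of elements to search for.

Return value:

1. It returns an iterator to the first element of the first occurrence of the sequence [first2, last2) within the range [first1, last1).
2. If the sequence is not found, it returns last1 (the past-the-end iterator of the searched range, not its last element). If [first2, last2) is empty, it returns first1.

Example: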

#include <iostream>
#include <algorithm>
#include <iterator>   // for std::distance
#include <vector>

int main() {
    std::vector<int> vec = {1, 2, 3, 4, 5, 6, 7, 8, 9};
    std::vector<int> subseq = {4, 5, 6};

    auto result = std::search(vec.begin(), vec.end(), subseq.begin(), subseq.end());

    if (result != vec.end()) {
        std::cout << "Subsequence found at position: " << std::distance(vec.begin(), result) << std::endl;
    } else {
        std::cout << "Subsequence not found." << std::endl;
    }

    return 0;
}

`2.`

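`std::find` is a C++ standard library function

that looks for a specific value in a container (such as `std::vector`, `std::list`, or `std::array`) or in a plain array.

It searches the given range for the value and returns an iterator to it.

template< class InputIt, class T >
InputIt find( InputIt first, InputIt last, const T& value );

Main parameters

first and last are iterators delimiting the range to search.
first is where the search starts; last is where it ends (last itself is not examined).
value is the value to look for.

Return value:

1. `std::find` returns an iterator

to the first element that compares equal to value.

2. If no element matches, `std::find` returns the past-the-end iterator of the range (that is, the `last` argument).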

#include <iostream>
#include <algorithm>
#include <iterator>   // for std::distance
#include <vector>

int main() {
    std::vector<int> vec = {1, 2, 3, 4, 5};

    // Look for the element with value 3 in the vector
    auto it = std::find(vec.begin(), vec.end(), 3);

    if (it != vec.end()) {
        std::cout << "Found value 3 at index: " << std::distance(vec.begin(), it) << std::endl;
    } else {
        std::cout << "Value 3 not found in the vector." << std::endl;
    }

    return 0;
}

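Associative containers also provide `find` as a member function, called with the `.` operator,

for example target_set.find(item). The search range is then the whole of target_set, and for an `unordered_set` this is a hash lookup (average constant time) rather than a linear scan.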

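3. `throw std::invalid_argument` (declared in `<stdexcept>`)

is normally used to raise an exception signalling that a function or program received an invalid argument. Throwing inside a loop does leave the loop, but make sure the exception is handled wherever it is caught; if nothing catches it, the program terminates.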

4. C++ lambda (anonymous function) expressions

[capture clause](parameters) -> return_type { 
    // Lambda body (function implementation)
    // It can use captured variables, take parameters, and return a value
    // e.g., return some_value;
};

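The parts mean the following:

capture clause: the list of outer variables to capture; it may be empty or contain variables.
parameters: the lambda's parameter list, like that of an ordinary function.
return_type: the lambda's return type. It can be omitted (the compiler then deduces it) or written as auto.

Example: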

#include <iostream>

int main() {
    // Lambda example: a function that takes two arguments and returns their sum
    auto sum = [](int a, int b) { return a + b; };

    // Use the lambda to compute the result and print it
    int result = sum(5, 3);
    std::cout << "Sum: " << result << std::endl;

    return 0;
}

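Whether you need to list anything inside `[ ]` depends on whether the lambda body uses variables from the enclosing scope.

There are three forms:

Capture by value [=]:
- Captures the enclosing-scope variables used in the body by copy.
- Example: [=] captures every outer variable the body uses, by value.
Capture by reference [&]:
- Captures the enclosing-scope variables used in the body by reference.
- Example: [&] captures every outer variable the body uses, by reference.
Capturing specific variables [var1, var2]:
- Lists exactly which variables to capture.
- Example: [var1, &var2] captures var1 by value and `var2` by reference.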

int x = 5;
auto lambda_without_capture = []() {
    // The body uses no outer variables, so nothing needs to be captured
    return 42;
};

auto lambda_with_capture = [=]() {
    // The body reads the outer variable x, so x is captured by value
    return x + 10;
};

auto lambda_with_reference_capture = [&]() {
    // The body modifies the outer variable x, so x is captured by reference
    x = 100; // changes the value of the outer x
    return x + 10;
};

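5. `std::all_of` is an algorithm provided by the C++ standard library (header `<algorithm>`). It operates on a sequence of elements defined by a range.

template <class InputIt, class UnaryPredicate>
bool all_of(InputIt first, InputIt last, UnaryPredicate p);

first, last: iterators delimiting the range to check.
p: a unary predicate (function or function object) defining the condition to check.

What it does:

std::all_of checks whether the unary predicate p returns true for every element in the range.
If all elements in the range satisfy the condition it returns true; otherwise it returns false.
If the range is empty (first == last), it returns `true`.

Example: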

#include <algorithm>
#include <vector>

bool isOdd(int num) {
    return num % 2 != 0;
}

int main() {
    std::vector<int> numbers = {1, 3, 5, 7, 9};

    // Check if all elements in the vector are odd using a lambda function
    bool allAreOdd = std::all_of(numbers.begin(), numbers.end(), [](int n) { return n % 2 != 0; });

    // Alternatively, using a named function
    bool allAreOdd2 = std::all_of(numbers.begin(), numbers.end(), isOdd);

    return 0;
}

Working code:

#include <iostream>
#include <algorithm>
#include <unordered_set>
#include <stdexcept>
#include <unordered_map>
#include "wikiscraper.h"
#include "error.h"

using std::cout;            using std::endl;
using std::cerr;            using std::string;
using std::unordered_map;   using std::unordered_set;

/*
 * You should delete the code in this function and
 * fill it in with your code from part A of the assignment.
 *
 * If you used any helper functions, just put them above this function.
 */

// TODO: ASSIGNMENT 2 TASK 4:
// Please implement a function that can determine if a wiki link is valid or not.
// As a reminder, it needs to take in a string and return whether or not 
// # or : is contained in the string.
// Estimated length: ~5-10 lines

///
// BEGIN STUDENT CODE HERE
bool valid_wikilink(const string& link) {
    // A link is valid only if none of its characters is '#' or ':'.
    return std::all_of(link.begin(), link.end(), [](char c) {
        return c != '#' && c != ':';
    });
}
// END STUDENT CODE HERE
///

unordered_set<string> findWikiLinks(const string& inp) {
    /* Delimiter for start of a link  */
    static const string delim = "href=\"/wiki/";

    unordered_set<string> ret;

    auto url_start = inp.begin();
    auto end = inp.end();

    while(true) {

        // TODO: ASSIGNMENT 2 TASK 1:
        // Set url_start to the next location of "delim" (starting your search at url_start), using std::search.
        // After doing so, break out of the while loop if there are no occurrences of delim left
        // (use your work from the line above).
        // Estimated length: 2-3 lines
        ///
        // BEGIN STUDENT CODE HERE
        
        // Use std::search to find the next occurrence of delim at or after url_start.
        // If delim does not occur, std::search returns the end iterator.
        url_start = std::search(url_start, end, delim.begin(), delim.end());
        if (url_start == end)
            // No occurrences of delim are left: leave the loop.
            break;
        
        // END STUDENT CODE HERE
        ///

        // TODO: ASSIGNMENT 2 TASK 2:
        // Set url_end to the end of the wikilink. Start searching after the delimeter you found above.
        // Make sure to use std::find! (std::find looks for a single element in a container, e.g. character in 
        // a string—std::search looks for a series of elements in a container, like a substring in a string. 
        // remember that a string is represented as an array of characters, and is also a container!)
        // Estimated length: 1 lines

        ///
        auto url_end = std::find(url_start+delim.length(),end,'\"');
        // END STUDENT CODE HERE
        ///

        // TODO: ASSIGNMENT 2 TASK 3:
        // Last exercise of this function! Create a string from the two iterators (url_start and url_end) above
        // using a string constructor. Make sure you start the string AFTER the delimiter you found in task 5!
        // Estimated length: 1 lines
        
        ///
        // BEGIN STUDENT CODE HERE (delete/edit this line)
        string link(url_start + delim.length(), url_end);
        // END STUDENT CODE HERE
        ///

        /*
         * Only add link to the set if it is valid i.e. doesn't
         * contain a ':' or a '#'.
         */
        if(valid_wikilink(link)){
            ret.insert(link);
        }

        url_start = url_end;

    }
    return ret;

}


/*
 * ==================================================================================
 * |                Don't edit anything below here, but take a peek!                |
 * ==================================================================================
 */
unordered_set<string> WikiScraper::getLinkSet(const string& page_name) {
    if(linkset_cache.find(page_name) == linkset_cache.end()) {
        auto links = findWikiLinks(getPageSource(page_name));
        linkset_cache[page_name] = links;
    }
    return linkset_cache[page_name];
}


WikiScraper::WikiScraper() {
    (void)getPageSource("Main_Page");
}


string createPageUrl(const string& page_name) {
    return "https://en.wikipedia.org/wiki/" + page_name;
}

void notFoundError(const string& msg, const string& page_name, const string& url) {
    const string title = "    AN ERROR OCCURED DURING EXECUTION.    ";
    const string border(title.size() + 4, '*');
    cerr << endl;
    errorPrint(border);
    errorPrint("* " + title + " *");
    errorPrint(border);
    errorPrint();
    errorPrint("Reason: " + msg);
    errorPrint();
    errorPrint("Debug Information:");
    errorPrint();
    errorPrint("\t- Input parameter: " + page_name);
    errorPrint("\t- Attempted url: " + url);
    errorPrint();
}

string WikiScraper::getPageSource(const string &page_name) {
    const static string not_found = "Wikipedia does not have an article with this exact name.";
    if(page_cache.find(page_name) == page_cache.end()) {
        string url = createPageUrl(page_name);
        // using the cpr library to get the HTML content of a webpage!
        // we do so by aking a GET REST request to a wikipedia webpage, which
        // returns the content of the webpage. when this assignment was on QtCreator,
        // we had a whole separate assignment for making sure an alternate Internet Library
        // (not cpr) was working on your personal pc. look how simple it is now!
        cpr::Response r = cpr::Get(cpr::Url{url});

        string ret = r.text;
        if (r.status_code != 200) {
            notFoundError("Couldn't get page source. Have you entered a valid link?", page_name, url);
            return "";
        }
        if(std::search(ret.begin(), ret.end(), not_found.begin(), not_found.end()) != ret.end()){
            notFoundError("Page does not exist!", page_name, url);
            return "";
        }
        size_t indx = ret.find("plainlinks hlist navbar mini");
        if(indx != string::npos) {
            return ret.substr(0, indx);
        }
        page_cache[page_name] = ret;
    }
    return page_cache[page_name];
}

Test run:

Running test: ./build/test 1
Running test: ./build/test 2
Running test: ./build/test 3
Running test: ./build/test 4
Running test: ./build/test 5
Running test: ./build/test 6
Running test: ./build/test 7
Running test: ./build/test 8
All 8 tests passed!

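Part B:

Syntax you need to know:

1. `std::count_if` is a function provided by the C++ standard library, declared in the `<algorithm>` header. It is part of the algorithms library and

counts the elements in a range that satisfy a given condition.

Template: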

template <class InputIt, class UnaryPredicate>
typename iterator_traits<InputIt>::difference_type
count_if(InputIt first, InputIt last, UnaryPredicate p);

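Main parameters

first and last delimit the range of elements to examine.
p is a unary predicate (function or function object) that decides whether an element satisfies the condition.

Return value:

The function returns the number of elements in the range `[first, last)` for which the predicate `p` is true.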

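2. In C++, `decltype` is a specifier

that, applied to an expression, yields the type of that expression. It is useful when you want to declare a variable or function with the same type as an existing expression.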

// Example 1: Using decltype with a variable
int x = 42;
decltype(x) y = 10;  // y has the same type as x, which is int

// Example 2: Using decltype with an expression
float a = 3.14;
float b = 2.718;
decltype(a + b) result = a + b;  // result has the same type as the expression a + b, which is float

// Example 3: Using decltype with a function
bool compare(int a, int b) {
    return a < b;
}
decltype(compare) *ptr = compare;  // decltype(compare) is the function type bool(int, int); ptr points to such a function

Code (the logic looks correct to me, but the tests always time out,

possibly because of the web scraper provided with the assignment):

#include <iostream>     // for cout, cin
#include <fstream>      // for ifstream
#include <sstream>      // for stringstream
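#include <filesystem>   // making inputting files easier
#include <algorithm>    // for std::count_if, std::copy
#include <iterator>     // for std::ostream_iterator
#include <string>       // for std::string, getline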
#include <stdexcept>
#include <unordered_set>
#include <vector>
#include <queue>
#include <unordered_map>
#include "wikiscraper.h"

using std::cout;            using std::endl;
using std::ifstream;        using std::stringstream;
using std::string;          using std::vector;
using std::priority_queue;  using std::unordered_map;
using std::unordered_set;   using std::cin;

/*
 * This is the function you will be implementing parts of. It takes
 * two string representing the names of a start_page and
 * end_page and is supposed to return a ladder, represented
 * as a vector<string>, of links that can be followed from
 * start_page to get to the end_page.
 *
 * For the purposes of this algorithm, the "name" of a Wikipedia
 * page is what shows at the end of the URL when you visit that page
 * in your web browser. For ex. the name of the Stanford University
 * Wikipedia page is "Stanford_University" since the URL that shows
 * in your browser when you visit this page is:
 *
 *       https://en.wikipedia.org/wiki/Stanford_University
 */

// TODO: ASSIGNMENT 2 TASK 5:
// Please implement the following function, which should take in two sets of strings
// and returns the number of common strings between the two sets. You should use 
// lambdas and std::count_if.
// Estimated length: <4 lines

///
// BEGIN STUDENT CODE HERE
int numCommonLinks(const unordered_set<string>& curr_set, const unordered_set<string>& target_set) {
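    // count_if returns the number of elements in the range that satisfy the predicate
    return std::count_if(curr_set.begin(),curr_set.end(),[&target_set](const string & item){
        // unordered_set::find returns end() when the item is not present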
        return target_set.find(item) != target_set.end();
    }); 
}
// END STUDENT CODE HERE
///

vector<string> findWikiLadder(const string& start_page, const string& end_page) {
    WikiScraper w;

    /* Create alias for container backing priority_queue */
    using container = vector<vector<string>>;
    // getLinkSet maps a page name (string) to the set of links on that page (unordered_set<string>), backed by a hash-map cache
    unordered_set<string> target_set = w.getLinkSet(end_page);
    

    // TODO: ASSIGNMENT 2 TASK 6:
    // Please implement the comparator function that will be used in the priority queue.
    // You'll need to consider what variables this lambda will need to capture, as well as
    // what parameters it'll take in. Be sure to use the function you implemented in Task 1!
    // Estimated length: <3 lines
    
    ///
    // BEGIN STUDENT CODE HERE

    // left and right are both ladders (paths of page names)
    auto cmp_fn = [&w, &target_set](const vector<string>& left, const vector<string>& right) {
        // left.back() is the page the ladder currently ends at, e.g. Strawberry in Fruit → Strawberry
        // getLinkSet returns the set of links found on that page
        int left_num=numCommonLinks(w.getLinkSet(left.back()),target_set);
        int right_num=numCommonLinks(w.getLinkSet(right.back()),target_set);
        return left_num < right_num ;
    };
    // END STUDENT CODE HERE
    ///

    // TODO: ASSIGNMENT 2 TASK 7:
    // Last exercise! please instantiate the priority queue for this algorithm, called "queue". Be sure 
    // to use your work from Task 2, cmp_fn, to instantiate our queue. 
    // Estimated length: 1 line
    
    ///
    // BEGIN STUDENT CODE HERE
    // something like priority_queue<...> queue(...);
    // please delete ALL 4 of these lines! they are here just for the code to compile.
    
    //template<   class T,    class Container = std::vector<T>,       class Compare = std::less<typename Container::value_type>       >
    //class priority_queue;
    
    std::priority_queue<vector<string>, vector<vector<string>>, decltype(cmp_fn)> queue(cmp_fn);
    // END STUDENT CODE HERE
    ///

    queue.push({start_page});
    unordered_set<string> visited;

    while(!queue.empty()) {
        vector<string> curr_path = queue.top();
        queue.pop();
        string curr = curr_path.back();

        auto link_set = w.getLinkSet(curr);

        /*
         * Early check for whether we have found a ladder.
         * By doing this check up here we spead up the code because
         * we don't enqueue every link on this page if the target page
         * is in the links of this set.
         */
        if(link_set.find(end_page) != link_set.end()) {
            curr_path.push_back(end_page);
            return curr_path;
        }

        for(const string& neighbour : link_set) {
            if(visited.find(neighbour) == visited.end()) {
                visited.insert(neighbour);
                vector<string> new_path = curr_path;
                new_path.push_back(neighbour);
                queue.push(new_path);
            }
        }
    }
    return {};
}

int main() {
    // a quick working directory fix to allow for easier filename inputs
    auto path = std::filesystem::current_path() / "res/";
    std::filesystem::current_path(path);
    std::string filenames = "Available input files: ";

    for (const auto& entry : std::filesystem::directory_iterator(path)) {
        std::string filename = entry.path().string();
        filename = filename.substr(filename.rfind("/") + 1);
        filenames += filename + ", ";
    }
    // omit last ", ".
    cout << filenames.substr(0, filenames.size() - 2) << "." << endl;

    /* Container to store the found ladders in */
    vector<vector<string>> outputLadders;

    cout << "Enter a file name: ";
    string filename;
    getline(cin, filename);

    ifstream in(filename);
    int numPairs;
    // parse the first line as the number of tokens
    in >> numPairs;

    // loop through each line, parsing out page names and calling findWikiLadder
    string startPage, endPage;
    for (int i = 0; i < numPairs; i++) {
        // parse the start and end page from each line
        in >> startPage >> endPage;
        outputLadders.push_back(findWikiLadder(startPage, endPage));
    }

    /*
     * Print out all ladders in outputLadders.
     * We've already implemented this for you!
     */
    for (auto& ladder : outputLadders) {
        if(ladder.empty()) {
            cout << "No ladder found!" << endl;
        } else {
            cout << "Ladder found:" << endl;
            cout << "\t" << "{";

            std::copy(ladder.begin(), ladder.end() - 1,
                      std::ostream_iterator<string>(cout, ", "));
            /*
             * The above is an alternate way to print to cout using the
             * STL algorithms library and iterators. This is equivalent to:
             *    for (size_t i = 0; i < ladder.size() - 1; ++i) {
             *        cout << ladder[i] << ", ";
             *    }
             */
            cout << ladder.back() << "}" << endl;
        }
    }
    return 0;
}

HTML code of the assignment 1 wiki page:

[Free] cs106L-assignment1 page HTML code resource (CSDN文库)
