What I Learnt From Taking a Masters in Computer Vision and Machine Learning

Experience

I wrote this article as a reflection on what I picked up from undertaking an MSc in Machine Learning. I have to admit, some parts of my studies were useful, and some parts weren’t.

This article covers my experiences and course content, but I’m sure that courses at other universities will not differ too significantly. Therefore, some readers could use this article as a window into what to expect from an MSc in Machine Learning and Computer Vision.

Along with what I learnt during my studies, I’ll also cover how relevant the academic knowledge I gained is to my current job role as a Computer Vision Engineer.

Prerequisites

An advanced degree in machine learning covers a few selected topics that reflect the direction in which the field is progressing.

There is so much to be covered in any given machine learning course. Therefore, the MSc I undertook required students to have the following prerequisites before being accepted onto the course.

  • Good understanding of Linear Algebra and Calculus (Differentiation/Optimization)
  • Basic-level studies in Statistics and Probability
  • Background in a programming language
  • Undergraduate study in Computer Science, Mathematics, Physics, or Electronic and Mechanical Engineering

Now on to the key information I learnt from my Masters in Machine Learning.

1. Computer Vision

Let me start with my strongest modules from the course.

Computer Vision and Deep Learning are the areas of machine learning that genuinely interest me. Perhaps I’m drawn to the field because of the direct impact the techniques developed in it can have.

Media outlets have sung the praises of how far computer vision technology has progressed over the decades. The rapid advent of facial recognition systems is unmissable. It’s difficult not to find a facial recognition system at major international airports, banks and government organizations.

Studies in computer vision within my masters were very structured. You were not expected to jump straight into implementing and analyzing state-of-the-art techniques.

In fact, you took several steps back. You started by learning the basic image processing techniques that were developed before the advanced computer vision techniques we see and use today.

[Image: photo by Gery Wibowo on Unsplash]

Within Deep Learning, we understand that the lower layers of a convolutional neural network learn low-level patterns, such as lines and edges, from the input image.

But before the introduction of Convolutional Neural Networks (CNNs) into computer vision, heuristic-based techniques were used to detect regions of interest and extract features from images.
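
To make the idea of a hand-crafted, heuristic feature detector concrete, here is a minimal sketch of edge detection with a fixed Sobel kernel. It is only an illustration, not material from the course: the helper name `filter2d` and the toy image are assumptions made for this example.

```python
import numpy as np


def filter2d(image, kernel):
    """Slide `kernel` over `image` and sum the element-wise products.

    Only positions where the kernel fits entirely inside the image are
    computed, so the output is smaller than the input. Like the layers
    in most CNN libraries, this is cross-correlation (no kernel flip).
    """
    kh, kw = kernel.shape
    out_h = image.shape[0] - kh + 1
    out_w = image.shape[1] - kw + 1
    out = np.zeros((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out


# Sobel kernel that responds to horizontal changes in intensity,
# i.e. to vertical edges.
SOBEL_X = np.array([[-1.0, 0.0, 1.0],
                    [-2.0, 0.0, 2.0],
                    [-1.0, 0.0, 1.0]])

# Toy grayscale image: dark left half, bright right half.
image = np.zeros((5, 6))
image[:, 3:] = 1.0

edges = filter2d(image, SOBEL_X)
print(edges[0])  # [0. 4. 4. 0.]: strong response only around the edge
```

The kernel here is chosen by hand; a CNN's lower layers end up learning kernels of a similar flavour from data instead.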

My computer vision studies ensured I understood the foundations of the field by teaching me how some of these heuristic-based techniques work and how they are utilized in practical applications.

Computer vision studies provided me with knowledge of traditional machine learning techniques to process images, extract features and classify the descriptors obtained from images.

Below are a few key topics and terms that were introduced during my computer vision studies:

Feel free to skip the definitions. I placed these here to provide some information for curious individuals…

  • Scale-Invariant Feature Transform (SIFT): This is a computer vision technique that is used to generate an image keypoint descriptor (feature vector). The generated descriptor contains information on features such as edges, corners and blobs. The descriptor can be used to detect objects across images of different scales and distortion. SIFT is utilized in applications such as object recognition, gesture recognition, and tracking. The technique was introduced in a research paper by David Lowe. The critical thing about SIFT is that its detected features are invariant to scaling, translation and rotation, and remain robust under a moderate amount of affine distortion.

  • Histogram of Oriented Gradients (HOG): This is a technique used to extract features from an image. The extracted features are derived from the information provided by the edges and corners within the image, more specifically, of the objects within the image. A simple description of the technique is that it identifies the location of edges (gradients), corners and lines within an image, and also obtains information about the orientation of the edges. The HOG descriptor generates a histogram that describes the distribution of the edge and orientation information detected in an image. This technique can be found in computer vision applications and within image processing as well.

  • Principal Component Analysis (PCA): An algorithm to reduce the dimensions of feature-rich datasets. The reduction is achieved by projecting data points from a higher-dimensional space onto a lower-dimensional subspace, chosen so that as much information as possible is retained and information loss is minimized.
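
As a rough illustration of the PCA definition above, the sketch below projects 3-D points onto their first principal component using NumPy's singular value decomposition. The function name `pca` and the synthetic data are assumptions for this example; a real project would more likely use an existing library implementation.

```python
import numpy as np


def pca(data, n_components):
    """Project `data` (n_samples x n_features) onto its top principal components."""
    centered = data - data.mean(axis=0)
    # Rows of `vt` are the principal directions, ordered by decreasing variance.
    _, singular_values, vt = np.linalg.svd(centered, full_matrices=False)
    components = vt[:n_components]
    projected = centered @ components.T
    # Fraction of the total variance captured by each kept component.
    explained = (singular_values ** 2) / np.sum(singular_values ** 2)
    return projected, components, explained[:n_components]


# Synthetic data: 3-D points that lie almost exactly on one line.
rng = np.random.default_rng(0)
t = rng.normal(size=200)
direction = np.array([1.0, 2.0, 3.0])
data = np.outer(t, direction) + 0.01 * rng.normal(size=(200, 3))

projected, components, explained = pca(data, n_components=1)
print(projected.shape)   # (200, 1): three features reduced to one
print(explained[0])      # close to 1: little information is lost
```

Because the points were generated along a single direction, one component is enough here; on real image descriptors the number of components is a trade-off between size and retained variance.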

Other notable topics to mention are as follows:

  • Linear interpolation

  • Unsupervised Clustering (K-Means)

  • Bag of Visual Words (Visual search system)
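
K-Means and Bag of Visual Words are closely linked: the cluster centroids act as the "visual words", and an image is summarized by counting how many of its descriptors fall into each cluster. The sketch below is a minimal NumPy version under simplifying assumptions (2-D synthetic "descriptors", a fixed number of iterations, hypothetical helper names `assign` and `kmeans`); it is not the course's implementation.

```python
import numpy as np


def assign(points, centroids):
    """Return, for each point, the index of its nearest centroid."""
    dists = np.linalg.norm(points[:, None, :] - centroids[None, :, :], axis=2)
    return dists.argmin(axis=1)


def kmeans(points, k, n_iters=20, seed=0):
    """Plain K-Means: alternate between assignment and centroid update."""
    rng = np.random.default_rng(seed)
    # Start from k distinct data points chosen at random.
    centroids = points[rng.choice(len(points), size=k, replace=False)].copy()
    for _ in range(n_iters):
        labels = assign(points, centroids)
        for c in range(k):
            members = points[labels == c]
            if len(members) > 0:  # keep the old centroid if a cluster is empty
                centroids[c] = members.mean(axis=0)
    return centroids, assign(points, centroids)


# Synthetic "descriptors": two well-separated groups of 50 points each.
rng = np.random.default_rng(42)
descriptors = np.vstack([
    rng.normal(loc=0.0, scale=0.5, size=(50, 2)),
    rng.normal(loc=10.0, scale=0.5, size=(50, 2)),
])

centroids, labels = kmeans(descriptors, k=2)

# Bag-of-visual-words vector: how many descriptors map to each "word".
word_histogram = np.bincount(labels, minlength=2)
print(word_histogram)
```

In a visual search system the descriptors would come from a feature extractor such as SIFT, and the resulting histograms would be compared between images.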

Very early on in my studies, I was expected to begin developing computer vision-based applications. Object classification is a popular topic, and one from which it is relatively easy to gain some essential knowledge and to implement.

Within my studies, I was tasked with developing a visual search system in MATLAB.

MATLAB is a programming language developed for efficient numerical computing and matrix manipulation, and the MATLAB library is equipped with a suite of algorithms and visualization tools.

Having past programming experience in JavaScript, Java, and Python helped me pick up MATLAB’s syntax easily, so that I could focus wholeheartedly on the computer vision aspect of the studies.

More Information…

The visual search system I had to implement was rather basic. You pass a query image into the system, and the system returns a set of images that are similar to the query image.

I should mention that the system contained a database of stored images from which the result images were drawn (query image in, result images out).

The visual search system didn’t use any fancy deep learning techniques, but rather some of the traditional machine learning techniques mentioned earlier.

You simply pass in an RGB image, which is converted to grayscale, and a feature extractor is then applied to the image; thereafter, an image descriptor is extracted and represented as a point in an N-dimensional feature space.

Within this feature space, you can find similar images by calculating the Euclidean distance between two N-dimensional points: the smaller the distance, the more similar the images.
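
The pipeline just described (RGB in, grayscale, descriptor, Euclidean distance) can be sketched in a few lines of Python. The original system was written in MATLAB and its feature extractor is not specified here, so this sketch swaps in a simple normalized intensity histogram as a stand-in descriptor; the names `rgb_to_gray`, `describe` and `search` are hypothetical.

```python
import numpy as np


def rgb_to_gray(rgb):
    """Convert an H x W x 3 RGB array with values in [0, 1] to grayscale."""
    weights = np.array([0.299, 0.587, 0.114])  # common luma weights (ITU-R BT.601)
    return rgb @ weights


def describe(gray, bins=8):
    """Stand-in descriptor: a normalized intensity histogram with `bins` entries."""
    hist, _ = np.histogram(gray, bins=bins, range=(0.0, 1.0))
    return hist / hist.sum()


def search(query_rgb, database_rgb):
    """Rank database images from most to least similar to the query."""
    query = describe(rgb_to_gray(query_rgb))
    descriptors = np.array([describe(rgb_to_gray(img)) for img in database_rgb])
    # Euclidean distance between the query and every stored descriptor.
    distances = np.linalg.norm(descriptors - query, axis=1)
    return np.argsort(distances), distances


# Toy database: a dark, a mid-grey and a bright image; the query is dark.
database = [np.full((4, 4, 3), value) for value in (0.1, 0.55, 0.9)]
query = np.full((4, 4, 3), 0.05)

ranking, distances = search(query, database)
print(ranking[0])  # 0: the dark image is the closest match
```

Swapping `describe` for a stronger descriptor (for example HOG or a bag of visual words) changes the quality of the results but not the structure of the search.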

Things start getting serious…

Understanding computer vision is not limited to working with images; you are also expected to apply algorithms and techniques to videos.

Remember that videos are just sequences of images, so you are not learning anything new in terms of the preparation and handling of input data.

Object tracking within a series of images seems trivial if you are using an object detection framework such as YOLO or RCNN. But recognize that studying computer vision is not about using pre-trained networks and fine-tuning them. It’s about understanding how the field itself has progressed over the years, and the best way to gain a solid understanding is by surveying the variety of traditional techniques that have been developed over time.

So for the task of object tracking, the following topics were introduced:

  • Blob trackers

  • Kalman filters

  • Particle filters

  • Markov process
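
Of the tracking topics listed, the Kalman filter is the easiest to show in a few lines. The sketch below tracks a single coordinate of an object under a constant-velocity assumption; the function name `kalman_track` and all noise settings are illustrative assumptions, not values from the course.

```python
import numpy as np


def kalman_track(measurements, dt=1.0, process_var=1e-3, meas_var=1.0):
    """Track 1-D position and velocity from noisy position measurements."""
    F = np.array([[1.0, dt], [0.0, 1.0]])  # constant-velocity motion model
    H = np.array([[1.0, 0.0]])             # only the position is observed
    Q = process_var * np.eye(2)            # process noise covariance (assumed)
    R = np.array([[meas_var]])             # measurement noise covariance (assumed)

    x = np.array([[measurements[0]], [0.0]])  # initial state: first reading, zero velocity
    P = 10.0 * np.eye(2)                       # large initial uncertainty
    estimates = []
    for z in measurements:
        # Predict: move the state forward with the motion model.
        x = F @ x
        P = F @ P @ F.T + Q
        # Update: correct the prediction with the new measurement.
        residual = np.array([[z]]) - H @ x
        S = H @ P @ H.T + R
        K = P @ H.T @ np.linalg.inv(S)
        x = x + K @ residual
        P = (np.eye(2) - K @ H) @ P
        estimates.append(x.ravel().copy())
    return np.array(estimates)  # columns: position, velocity


# An object moving 2 units per frame, observed with noise.
rng = np.random.default_rng(0)
true_positions = 2.0 * np.arange(50)
noisy = true_positions + rng.normal(scale=1.0, size=50)

estimates = kalman_track(noisy)
print(estimates[-1])  # final [position, velocity] estimate
```

A tracker for video would use the same predict and update steps with a 2-D position (and usually a detector or blob tracker supplying the measurements).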

Relevancy as a Computer Vision Engineer

I’ll be honest, I haven’t yet utilized any traditional machine learning classifiers within my current role, and I don’t think I’ll be using any soon.

But to give you an idea of how relevant some of the mentioned techniques are, it’s worth stating that self-driving cars, license plate readers and lane detectors incorporate one or two of the methods discussed earlier.

[Images. Left: photo by Bram Van Oost on Unsplash. Right: photo by Trent Szmolnik on Unsplash]

2. Deep Learning

Deep learning is a natural progression from computer vision studies.

Some deep learning topics were already covered within the computer vision module, while others were extensions of or improvements to traditional computer vision techniques.

The teaching of deep learning topics took a similar path to my computer vision studies: building a solid understanding of the fundamentals of the field before moving on to advanced topics and application development.

[Image from kisina/Getty Images]

Deep learning studies started from an understanding of the fundamental building block of images: the pixel.

You quickly learn that a digital image is a grid of pixels.

After understanding the atomic basis of images, you move on to learn how images are stored within system memory.

The framebuffer is the name given to the location where pixels are stored within system memory (not a lot of MOOCs will teach you this).
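
To connect the "grid of pixels" view with how pixels sit in memory, the sketch below treats a small image as a NumPy array and computes where a given sample lives in the flat buffer behind it. It assumes a row-major, channel-interleaved layout with no padding between rows; real framebuffers may differ (for example in channel order or row padding), and `flat_offset` is a hypothetical helper.

```python
import numpy as np

height, width, channels = 4, 6, 3

# A tiny RGB image: a grid of 4 x 6 pixels with 3 samples per pixel.
# Values are just a running counter so each sample is easy to identify.
image = np.arange(height * width * channels, dtype=np.uint8).reshape(
    height, width, channels
)


def flat_offset(row, col, channel, width, channels):
    """Index of a sample in a row-major, interleaved (RGBRGB...) buffer."""
    return (row * width + col) * channels + channel


buffer = image.ravel()  # the same samples laid out as one flat sequence

offset = flat_offset(2, 3, 1, width, channels)
print(offset)                              # 46
print(buffer[offset] == image[2, 3, 1])    # True
print(image.strides)                       # (18, 3, 1): bytes to step per row, pixel, channel
```

The strides make the layout explicit: moving one row down skips `width * channels` bytes, while moving one pixel right skips `channels` bytes.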

[Image: photo by Gery Wibowo on Unsplash]

I also gained knowledge of how camera devices actually capture digital images.

I have to admit it was great having some intuition as to how a smartphone camera captures images.

Let’s fast forward to some cooler stuff.

You can’t learn Deep Learning without an understanding of Convolutional Neural Networks (CNNs); they go hand in hand.

My studies covered the timeline of the introduction and development of CNNs over the last 20 years (from LeNet-5 to RCNNs) and their role in replacing the traditional pipeline for typical computer vision tasks such as object recognition.

My studies also explored different CNN architectures presented in the early days of deep learning. AlexNet, LeNet and GoogLeNet were the case studies used to build an understanding of the internals of a convolutional neural network and their application to tasks such as object detection, recognition and classification.

One important skill I learnt was how to read research papers.

Reading research papers was not a skill that you were directly taught. If you are serious about deep learning, and about studying in general, it’s always a good idea to go to the source of information and research. It’s rather easy to utilize the pretrained models offered by deep learning frameworks. Still, advanced study expects you to know the intrinsic details of the techniques and components of each architecture presented, information that is only presented in research papers.

Here’s a summary of topics that were covered in the deep learning module:

Feel free to skip the definitions. I placed these here to provide some information for curious individuals…

  • Multilayer Perceptron (MLP): A multilayer perceptron is several layers of perceptrons stacked consecutively, one after the other. The MLP is composed of one input layer, one or more layers of TLUs (threshold logic units) called hidden layers, and one final layer referred to as the output layer.

  • Neural Style Transfer (NST): A technique that uses deep convolutional neural networks and algorithms to extract the content information from one image and the style information from another reference image. After the extraction of style and content, a combination image is generated in which the content and the style are sourced from different images.

  • Recurrent Neural Networks (RNN) and LSTM: A variant of neural network architecture that can accept input of arbitrary length and produce output of arbitrary length. RNN architectures learn temporal relationships.

  • Face Detection: A term given to the task of implementing systems that can automatically identify and localize human faces in images and videos. Face detection is present in applications associated with facial recognition, photography, and motion capture.

  • Pose Estimation: The process of deducing the location of the main joints of a body from provided digital assets such as images, videos, or a sequence of images. Forms of pose estimation are present in applications such as action recognition, human interaction, creation of assets for virtual reality and 3D graphics games, robotics and more.

  • Object Recognition: The process of identifying the class a target object is associated with. Object recognition and detection are techniques with similar end results and implementation approaches, although in various systems and algorithms the recognition process comes before the detection steps.

  • Tracking: A method of identifying, detecting, and following an object of interest within a sequence of images over some time. Applications of tracking are found in many surveillance cameras and traffic monitoring devices.

  • Object Detection: Object detection is associated with Computer Vision and describes a system that can identify the presence and location of a desired object or body within an image. Do note that there can be a single occurrence or multiple occurrences of the object to be detected.
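
To make the first definition in the list concrete, here is a minimal sketch of a multilayer perceptron built from threshold logic units. The weights are set by hand (not learnt) so that the network computes XOR, a function a single perceptron cannot represent; the names `step` and `mlp_forward` are assumptions for this example.

```python
import numpy as np


def step(z):
    """Threshold logic unit activation: 1 if the weighted sum is positive, else 0."""
    return (z > 0).astype(float)


def mlp_forward(x, layers):
    """Pass inputs through a stack of (weights, bias) layers of TLUs."""
    activation = x
    for weights, bias in layers:
        activation = step(activation @ weights + bias)
    return activation


# Hidden layer: first unit fires for "x1 OR x2", second for "x1 AND x2".
W1 = np.array([[1.0, 1.0],
               [1.0, 1.0]])
b1 = np.array([-0.5, -1.5])

# Output layer: fires for "OR but not AND", which is XOR.
W2 = np.array([[1.0],
               [-1.0]])
b2 = np.array([-0.5])

inputs = np.array([[0.0, 0.0], [0.0, 1.0], [1.0, 0.0], [1.0, 1.0]])
outputs = mlp_forward(inputs, [(W1, b1), (W2, b2)])
print(outputs.ravel())  # [0. 1. 1. 0.]
```

Modern networks replace the hard step with differentiable activations so that the weights can be learnt with backpropagation rather than chosen by hand.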

Other notable topics and sub-topics include neural networks, backpropagation, CNN network architectures, super-resolution, gesture recognition and semantic segmentation.

Relevancy as a Computer Vision Engineer

This is essentially my bread and butter.

To date, I have incorporated face detection, gesture recognition, pose estimation and semantic segmentation models on edge devices for gaming purposes.

In my current role, I implement, train and evaluate a lot of deep learning models. If you would like to work with a lot of cutting-edge algorithms and tools, and within progressive companies, then deep learning is a field that can put you at the forefront of actual commercial developments in AI.

3. Thesis

The aim of a masters thesis is to enable you to utilize all the skills, knowledge and intuition you’ve gained during your studies to devise a solution to a real-life problem.

My thesis was based on the utilization of computer vision techniques for conducting motion analysis on quadrupeds (four-legged animals). The key computer vision technique that I used to perform motion analysis was pose estimation.

This was one of the first times I was introduced to the world of deep learning frameworks. I had already decided that my solution to motion analysis was going to be a deep learning-based solution that leveraged convolutional neural networks.

For the selection of the framework, I went back and forth between Caffe and Keras, but I settled on PyTorch due to its readily available pretrained models that were relevant to the task. Python was my programming language of choice.

[Images. Left: PyTorch. Right: Python]

Here’s a list of items I learnt as a result of my thesis:

  • Transfer learning / fine-tuning
  • Python programming language
  • C# programming language
  • Theoretical knowledge of pose estimation
  • Knowledge of how to use Unity3D for simulations
  • Experience using Google Cloud Platform

More Information On Motion Analysis

Motion analysis is the term given to the process of obtaining movement information and details from moving pictures, or from a collection of images that depict locomotion frame by frame.

In their most straightforward form, the results of applications and operations that utilize motion analysis are details about motion detection and keypoint localization. More complex applications use sequences of related images to track objects on a frame-by-frame basis.

In the present day, motion analysis and its various application forms provide significant benefits and rich information when utilized on temporal data.

Different industries benefit from the results and information provided through motion analysis. Sectors such as healthcare, manufacturing, mechanical and finance have various use cases and methods of applying motion analysis to solve problems or create value for consumers.

The diversity of how motion analysis is utilized across industry has indirectly introduced various subsets of motion analysis, such as pose estimation, object detection, object tracking and keypoint detection, among others.

More Information On The Thesis

The thesis presented an approach to motion analysis conducted utilizing computer vision and machine learning techniques. In the presented approach, a dataset of synthetic quadruped images was used to fine-tune a pre-trained keypoint detection network.

Keypoint-RCNN is a built-in model within the PyTorch library that extends the capabilities of the original Fast-RCNN and Faster-RCNN. The approach in the thesis modified the Keypoint-RCNN neural network architecture, pre-trained on the COCO 2017 object detection and segmentation dataset, and retrained the last layer with a synthetically generated dataset.

By expanding a baseline framework for keypoint detection of humans, with 17 joints across the body, I presented an extension of the framework that predicts the locations of the major joints on several generated quadrupeds with 26 joints.

[Image: a snippet of thesis results]

Quantitative and qualitative evaluation strategies were used to show the metric and visual performance of the modified Keypoint-RCNN architecture in predicting keypoints on an artificial quadruped.

If you’ve made it this far, I applaud you… let’s bring this article to a close.

Conclusion

The machine learning field is changing rapidly; the content of my course was relevant for the year 2018–2019. And already in 2020, we’ve seen massive contributions to several other areas of machine learning. So, don’t be surprised if you take a machine learning course and find yourself learning topics or subject areas I haven’t mentioned in this article.

And don’t forget, in AI it’s not just the models you create that have to learn. You, as a machine learning practitioner, have to keep up with new research, so keep learning.

I hope you found the article useful.

To connect with me or find more content similar to this article, do the following:

  1. Subscribe to my YouTube channel for upcoming video content

  2. Follow me on Medium

  3. Connect and reach me on LinkedIn

Translated from: https://towardsdatascience.com/what-i-learnt-from-taking-a-masters-in-computer-vision-and-machine-learning-69f0c6dfe9df
