Putting load data inpath into practice

The requirement: append data to a Hive table. The data is historical, delivered by the business side as a single CSV file.

The idea: use load data inpath to import the historical data into the Hive table.

Steps:

1. Upload the CSV file to HDFS using the big data platform that is already in place. I used the upload page the platform provides;
if that is not an option, you can use the command line: hadoop fs -put xxx.csv /export, which puts the file under a directory named export (a quick check is shown below).
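
If you go the command-line route, it may help to confirm the file actually landed before moving on; this is just a suggested check, using the placeholder path from the example above:

hadoop fs -ls /export            # the uploaded file should be listed here
hadoop fs -tail /export/xxx.csv  # peek at the tail of the CSV to verify its contents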

2. Create a staging table whose column names and column order match the CSV file, for example a table named infor.load_data_test (a sketch follows below).
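
A minimal sketch of that staging table, assuming the CSV has two comma-separated columns that line up with the field1 and field2 used in step 4 (the column names and types here are illustrative, not from the original post):

CREATE TABLE infor.load_data_test (
  field1 STRING,
  field2 STRING
)
ROW FORMAT DELIMITED
FIELDS TERMINATED BY ','   -- split each CSV line on commas
STORED AS TEXTFILE;        -- plain text, so the loaded CSV is readable as-is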

3. load data inpath 'hdfs://xxxxxx/user/hive/warehouse/export/xxx.csv' overwrite into table infor.load_data_test;
Here 'hdfs://xxxxxx/user/hive/warehouse/export/xxx.csv' is the HDFS location where the uploaded file ended up. Note that load data inpath moves the file from that location into the table's storage directory rather than copying it. A quick sanity check is shown below.
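
Before writing into the formal table, a quick look at the staging table confirms the load worked (a suggested check, not part of the original steps):

SELECT COUNT(*) FROM infor.load_data_test;     -- row count should match the CSV
SELECT * FROM infor.load_data_test LIMIT 10;   -- spot-check a few rows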

4. Write the data from the staging table into the formal Hive table:

INSERT INTO infor.formal_data_table
SELECT field1, field2, 1597456800000 AS create_time FROM infor.load_data_test;

A timestamp is added so you can tell when this batch of data was written.
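
The value 1597456800000 is a hard-coded epoch timestamp in milliseconds. If you would rather record the actual load time instead of a fixed value, one alternative (not what the original post does) is to let Hive compute it; unix_timestamp() returns seconds, so multiply by 1000 for milliseconds:

INSERT INTO infor.formal_data_table
SELECT field1, field2, unix_timestamp() * 1000 AS create_time  -- current time in milliseconds
FROM infor.load_data_test;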
