DB
泥鳅812
SQLAlchemy and MySQL: passwords containing '@', reading/writing Chinese text, iterator-based reads/writes, numeric precision

```python
class Database:
    def __init__(self):
        self.save_schema = 'dwd'
        self.save_table = 'dwd_xxx'
        self.mysql_dwd_config = {
            'drivername': 'mysql+pymysql',
            'username': 'nx_user_a',
            'password': 'xxx@#$xx
```
Original · 2021-12-23 14:58:17 · 930 views · 0 comments
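The usual fix for an '@' (or '#', '$') inside the password is to percent-encode it before building the connection URL, so it is not parsed as the host separator. A minimal sketch using only the stdlib; the host, port, and database name here are placeholders, not values from the post:

```python
from urllib.parse import quote_plus

# Password containing '@', '#', '$' -- these break a naive URL string.
password = 'xxx@#$xx'

# Percent-encode the password so '@' is not read as the host separator.
url = (
    "mysql+pymysql://nx_user_a:%s@127.0.0.1:3306/dwd?charset=utf8mb4"
    % quote_plus(password)
)
print(url)
```

The resulting string can be passed straight to `sqlalchemy.create_engine`; newer SQLAlchemy versions can also build it field-by-field with `sqlalchemy.engine.URL.create`, which quotes the password for you.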
Docker and Redis

```shell
# config directory
redis_dir=~/Install/redis-6.2.3
# edit the config file
vim $redis_dir/redis.conf
# start the server
nohup $redis_dir/src/redis-server $redis_dir/redis.conf &
# start a client
$redis_dir/src/redis-cli -a password
```
Original · 2021-06-01 15:07:23 · 140 views · 0 comments
Basic Elasticsearch operations

```shell
# list indices
curl -X GET 'localhost:9200/_cat/indices?v&pretty'
# list nodes
curl -X GET 'localhost:9200/_cat/nodes?v&pretty'
# create an index
curl -X PUT 'localhost:9200/customer?pretty&pretty'
# delete an index
curl -X DELETE 'localhost:9200/customer?pretty&pretty'
```
Original · 2021-02-03 11:05:55 · 118 views · 0 comments
Standard SQL script header block

```sql
-- =========================================================================
-- **Author:       xxx xxx@xx.com.cn
-- **Created:      201x-0x-xx
-- **Description:  xxxx wide table   Developer: xxx@xx.com.cn
-- **Requirements:
-- **
-- **
-- **Maintainer:   xxx...
```
Original · 2018-08-03 17:36:26 · 2022 views · 0 comments
Hive: converting between date formats

```sql
select from_unixtime(unix_timestamp('20180801', 'yyyyMMdd'), 'yyyy-MM-dd');
-- 2018-08-01
```
Original · 2018-08-01 15:30:00 · 1400 views · 0 comments
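The same reformatting in plain Python, for comparison (stdlib only):

```python
from datetime import datetime

# Parse 'yyyyMMdd' and re-emit as 'yyyy-MM-dd', mirroring
# from_unixtime(unix_timestamp(...)) above.
s = datetime.strptime('20180801', '%Y%m%d').strftime('%Y-%m-%d')
print(s)  # 2018-08-01
```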
Grouped ranking in PySpark, pandas, and SQL

```python
import pyspark.sql.functions as fn
from pyspark.sql import Window

df.withColumn(
    "row_number",
    fn.row_number().over(Window.partitionBy("id").orderBy(df["pt"].desc()))
).show()
```
Original · 2018-12-19 15:57:01 · 3449 views · 0 comments
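The pandas equivalent of that window is a grouped rank; a small sketch with made-up data (column names follow the PySpark snippet):

```python
import pandas as pd

df = pd.DataFrame({'id': [1, 1, 2], 'pt': [5, 9, 3]})

# row_number() over (partition by id order by pt desc):
# method='first' breaks ties in row order, like row_number().
df['row_number'] = (
    df.groupby('id')['pt']
      .rank(method='first', ascending=False)
      .astype(int)
)
print(df)
```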
Filtering a DataFrame by substring in PySpark and pandas

```python
# pandas:
import numpy as np
import pandas as pd

df = pd.DataFrame(np.array([['banana', 1], ['apple', 2], ['pear', 3]]).reshape(3, 2))
df.columns = ['a', 'b']
df2 = df[df['a'].str.contains('l')]
print(df2)
```
Original · 2019-02-23 06:15:45 · 3094 views · 0 comments
kafka-python_&&_pysparkStreamingContext
```python
# coding=utf-8
from pyspark import SparkContext
from pyspark.streaming import StreamingContext

sc = SparkContext("yarn", "stream_test")
ssc = StreamingContext(sc, 1)
# monitor: nc -lk 9999
lines = ...
```
Original · 2019-03-26 18:21:50 · 402 views · 0 comments
PySpark: caveats when combining conditions with "and"

```python
def getLevel(ltv):
    return fn.when((lv >= 6.81) & (lv <= 10.00), 'S')\
        .otherwise(
            fn.when((lv >= 6.08) & (lv < 6.81), 'A')\
            .otherwise(
                fn.when((l...
```
Original · 2019-04-29 16:50:26 · 1598 views · 0 comments
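The preview cuts off, but the key point survives: column conditions must be combined with `&` and parentheses, never Python's `and`. The same tiering logic can be sketched with `numpy.select`; the `'B'` fallback below is an assumption, since the original truncates before the final branch:

```python
import numpy as np
import pandas as pd

lv = pd.Series([9.5, 6.5, 3.0])

conditions = [
    (lv >= 6.81) & (lv <= 10.00),  # note '&' plus parentheses, not 'and'
    (lv >= 6.08) & (lv < 6.81),
]
level = np.select(conditions, ['S', 'A'], default='B')  # 'B' default is assumed
print(level)
```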
PySpark: getting the max of a column

```python
# columns: A, B
# Method 1: Use describe()
float(df.describe("A").filter("summary = 'max'").select("A").collect()[0].asDict()['A'])

# Method 2: Use SQL
df.registerTempTable("df_table")
spark.sql("SELECT...
```
Original · 2019-06-05 16:17:13 · 166 views · 0 comments
SQL: statistics by calendar week

```sql
-- stats over the last 2 full calendar weeks:
SELECT count(CASE WHEN (int(datediff(scheduled_date, '2001-01-01') / 7) IN
                            (int(datediff(CURRENT_DATE, '2001-01-01') / 7) - 2,
                             int(datediff(CURRENT_DATE, '2001-01-01') / 7) - 1)
                        AND class_status...
```
Original · 2019-06-24 20:02:06 · 4809 views · 0 comments
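The week bucket in that query is just integer division of days since 2001-01-01 by 7; since 2001-01-01 was a Monday, each bucket spans Monday through Sunday. A quick check of the arithmetic in Python:

```python
from datetime import date

def week_index(d: date) -> int:
    """Calendar-week bucket used by the query: days since 2001-01-01, div 7."""
    return (d - date(2001, 1, 1)).days // 7

print(week_index(date(2001, 1, 1)))  # 0
print(week_index(date(2001, 1, 7)))  # 0 (Sunday, same week)
print(week_index(date(2001, 1, 8)))  # 1 (next Monday)
```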
Writing count(CASE WHEN ...) the PySpark way

```python
import pyspark.sql.functions as fn

ff = lambda cond: fn.countDistinct(fn.when(cond, df['s_id']).otherwise(None))
cond = (df['class_status'] == 'FINISHED') & (df['finish_type'] == 'AS_SCHEDULED')
df.gro...
```
Original · 2019-09-04 15:59:24 · 3737 views · 0 comments
SQL: rank

```sql
SELECT
    t5.student_id,
    max(CASE WHEN t5.rank = 1 THEN t5.rating END)      AS near_comment_score_to_teacher,
    max(CASE WHEN t5.rank = 1 THEN t5.update_time END) AS near_comment_update_time
FROM
(SELECT
    student...
```
Original · 2018-05-30 14:22:21 · 2011 views · 0 comments
hadoop job kill
```shell
$ hadoop job -list
$ hadoop job -kill job_2018xxxxxxxxx_12345
```
Original · 2018-05-30 11:03:07 · 474 views · 0 comments
Linux commands (killing jobs, viewing logs, checking directory sizes, tar/compress, uploading files, pyspark json jar)

```shell
yarn application -list
yarn application -kill
hadoop job -list
hadoop job -kill
```
Original · 2018-06-05 17:24:50 · 1208 views · 0 comments
MapReduce explained in 41 words
Goal: count the number of books in the library.
Map: You count up shelf #1, I count up shelf #2.
(The more people we get, the faster this part goes.)
Reduce: We all get together and add up our individual...
Original · 2017-07-23 17:59:18 · 230 views · 0 comments
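The 41-word explanation maps directly onto code; a toy sketch of the same book count:

```python
from functools import reduce

# Each inner list is one shelf of books.
shelves = [['Dune', 'Emma'], ['Hamlet'], ['Ulysses', 'Iliad', 'Odyssey']]

# Map: each "worker" counts one shelf independently.
per_shelf = list(map(len, shelves))

# Reduce: combine the per-shelf counts into one total.
total = reduce(lambda a, b: a + b, per_shelf)
print(total)  # 6
```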
SQL Server performance tuning

1. How to find SQL Server performance bottlenecks (2008)
2. How to write efficient SQL: creating indexes
Original · 2017-08-16 16:44:52 · 306 views · 0 comments
SQL optimization

SQL optimization strategies, part 1
Original · 2017-08-10 14:41:50 · 170 views · 0 comments
SQL date and time functions

SQL date and time functions
Original · 2017-08-11 17:27:06 · 196 views · 0 comments
A reference of MySQL date and time functions

DAYOFWEEK(date): returns the weekday of date (1 = Sunday, 2 = Monday, ..., 7 = Saturday; ODBC standard)

```sql
mysql> select DAYOFWEEK('1998-02-03');
-> 3
```

WEEKDAY(date): returns the weekday of date (0 = Monday, 1 = Tuesday, ..., 6 = Sunday)

```sql
mysql> select WEEKDA...
```
Reposted · 2017-09-18 16:45:43 · 431 views · 0 comments
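For reference, Python's stdlib uses yet another numbering: `date.weekday()` matches MySQL's WEEKDAY (0 = Monday), and DAYOFWEEK can be derived from `isoweekday()`. A sketch using the same example date:

```python
from datetime import date

d = date(1998, 2, 3)  # a Tuesday, as in the example above

weekday = d.weekday()               # 0 = Monday ... 6 = Sunday, like WEEKDAY()
dayofweek = d.isoweekday() % 7 + 1  # 1 = Sunday ... 7 = Saturday, like DAYOFWEEK()
print(weekday, dayofweek)  # 1 3
```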
SQL Server: converting between Unix timestamps and dates

```sql
-- Unix timestamp to datetime
SELECT DATEADD(S, 1160701488 + 8 * 3600, '1970-01-01 00:00:00')
-- datetime to Unix timestamp
SELECT DATEDIFF(S, '1970-01-01 00:00:00', '2006-10-13 09:04:48.000') - 8 * 3600
```
Original · 2017-11-15 18:07:29 · 27318 views · 0 comments
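The `8 * 3600` shifts UTC to UTC+8. In Python the same round trip is cleaner with an explicit timezone:

```python
from datetime import datetime, timedelta, timezone

tz = timezone(timedelta(hours=8))  # UTC+8, matching the 8 * 3600 above

# Unix timestamp -> local datetime
dt = datetime.fromtimestamp(1160701488, tz)
print(dt.strftime('%Y-%m-%d %H:%M:%S'))  # 2006-10-13 09:04:48

# local datetime -> Unix timestamp
ts = int(dt.timestamp())
print(ts)  # 1160701488
```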
Hive performance tuning settings

```sql
-- Hive performance: execution engines (tez, spark, impala, mapreduce); vectorization
-- set hive.execution.engine = spark;
set hive.vectorized.execution.enabled = true;
set hive.vectorized.execution.reduce.enabled = true;
```
Original · 2018-04-17 10:19:37 · 354 views · 0 comments
SQL: counting per group

```sql
SELECT activity_id, count(*)
FROM activity_prize_lottery_record
GROUP BY activity_id
HAVING count(*) > 0
```
output:
```
+--------------+----------+--+
| activity_id  | _c1      |
+--------------+--------...
```
Original · 2018-04-17 15:25:17 · 1429 views · 0 comments
MySQL tips, part 1

```sql
show PROCESSLIST;
SELECT now();
SELECT DATE_SUB(CURDATE(), INTERVAL 30 DAY);
CREATE DATABASE qx_test;
CREATE TABLE IF NOT EXISTS qx_test.test (
    title VARCHAR(64) NOT NULL PRIMARY KEY...
```
Original · 2018-04-19 00:30:22 · 128 views · 0 comments
Hive SQL optimization

```sql
-- reducing the resource cost of date/time computations --
-- Hive performance: execution engines (tez, spark, impala, mapreduce); vectorization --
-- set hive.execution.engine = spark;
-- set hive.vectorized.execution.enabled = true;
-- set hive.vectorized.execution.reduce.enabl...
```
Original · 2018-04-25 17:32:15 · 269 views · 0 comments
Notes on Hive partitioned tables on a test server

```sql
-- hive table: creation time and HDFS directory
show create table <tablename>;
-- hive table: last update time
show table extended like <tablename>;
```
Below is the full process of downloading a Hive table from the shared server as local files and uploading it to the test server:
1. First check which directory the shared server keeps the Hive table in, then take Hadoop's hi...
Original · 2018-04-28 16:44:05 · 229 views · 0 comments
Configuring Hive for Elasticsearch

First put the jar on the local server at /home/yourname/elasticsearch-hadoop-5.2.0.jar, then try:

```shell
hdfs dfs -put elasticsearch-hadoop-5.2.0.jar /tmp/
```

Then add this at the top of the HQL file:

```sql
add jar file:/home/xieyulong/elasticsearch-hadoop-5.2.0.jar;
```

ok.
Original · 2018-05-08 18:21:52 · 700 views · 1 comment
Default database ports

mongodb: localhost:27017
sqlserver: 1433
URL: "jdbc:microsoft:sqlserver://localhost:1433;DatabaseName=dbname"
DRIVERNAME: "com.microsoft.jdbc.sqlserver.SQLServerDriver"
mysql: 3306
Original · 2017-05-29 18:55:55 · 307 views · 0 comments