TensorFlow
samoyan
Sharing the day-to-day of growing as an engineer
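Predicting with a TensorFlow model inside a pyspark UDF fails with _pickle.PicklingError: Could not serialize object:
The likely cause is that TensorFlow objects, and the code built around them, either cannot be serialized or need dedicated serialization handling. The specific error was: _pickle.PicklingError: Could not serialize object: TypeError: can't pickle _thread.RLock objects. To fix it, move the TensorFlow logic into a new file that exposes a single prediction interface, keep the pyspark code in its own file, import the interface there, and wrap it as a UDF. After this change the job ran normally.
Original · 2023-03-10 20:10:39 · 1473 views · 1 comment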
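The failure and the workaround can be reproduced without Spark or TensorFlow. The sketch below is only an illustration under stated assumptions: `FakeModel` is a hypothetical stand-in for a loaded model that holds a lock, and `get_model` / `predict` are hypothetical names for the interface that the separate file would expose.

```python
import pickle
import threading


class FakeModel:
    """Hypothetical stand-in for a loaded TensorFlow model that holds a lock."""

    def __init__(self):
        self._lock = threading.RLock()

    def predict(self, x):
        with self._lock:
            return x * 2


def can_pickle(obj):
    """Return True if pickle can serialize obj, False otherwise."""
    try:
        pickle.dumps(obj)
        return True
    except (TypeError, pickle.PicklingError):
        return False


# The model lives in module state and is created lazily on first use,
# so the function handed to Spark never carries the model object itself.
_MODEL = None


def get_model():
    global _MODEL
    if _MODEL is None:
        _MODEL = FakeModel()
    return _MODEL


def predict(x):
    """The only interface the Spark side would import and wrap as a UDF."""
    return get_model().predict(x)


# A lock object itself cannot be pickled, which is what the error reports.
lock_picklable = can_pickle(threading.RLock())
result = predict(21)
```

On the Spark side, the imported `predict` would then be wrapped with `pyspark.sql.functions.udf`, with the return type chosen to match the model output.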
Tuning the AdamWeightDecayOptimizer used by BERT during fine-tuning
# AdamWeightDecayOptimizer from the BERT source
class AdamWeightDecayOptimizer(tf.train.Optimizer):
    """A basic Adam optimizer that includes "correct" L2 weight decay."""
    def __init__(self, learning_rate, weight_decay_rate=0.0 ...
Original · 2022-04-26 15:25:09 · 1494 views · 1 comment
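For readers who want the update rule without the TensorFlow machinery, here is a scalar sketch of one optimizer step. It assumes the behaviour commonly described for BERT's optimizer: Adam moments without bias correction, and weight decay added to the update directly rather than to the gradient. The function name and the default values are illustrative assumptions, not copied from the truncated excerpt.

```python
import math


def adam_weight_decay_step(param, grad, m, v, lr=0.1, beta1=0.9,
                           beta2=0.999, eps=1e-6, weight_decay_rate=0.01):
    """One scalar step of Adam with decoupled weight decay (sketch)."""
    # Exponential moving averages of the gradient and the squared gradient.
    next_m = beta1 * m + (1.0 - beta1) * grad
    next_v = beta2 * v + (1.0 - beta2) * grad * grad
    # Plain Adam direction; no bias correction is applied in this sketch.
    update = next_m / (math.sqrt(next_v) + eps)
    # Decay is added to the update, so it does not pass through m and v.
    update += weight_decay_rate * param
    next_param = param - lr * update
    return next_param, next_m, next_v


# With a zero gradient only the decay term moves the parameter.
decayed, _, _ = adam_weight_decay_step(1.0, 0.0, 0.0, 0.0)
# With a positive gradient and no decay the parameter decreases.
stepped, m1, v1 = adam_weight_decay_step(1.0, 1.0, 0.0, 0.0,
                                         weight_decay_rate=0.0)
```

Keeping the decay out of the moment estimates is the difference from adding an L2 term to the loss, where the penalty gradient would be rescaled by the Adam denominator.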
Changing the default of keeping only 5 checkpoint files when training or fine-tuning BERT with tf.estimator
This is done by changing the settings passed to tf.estimator.RunConfig:
run_config = tf.estimator.RunConfig(
    model_dir=FLAGS.output_dir,
    save_summary_steps=FLAGS.save_summary_steps,
    save_checkpoints_steps=FLAGS.save_checkpoints_steps,
    session_config=sess_config ...
Original · 2022-04-18 14:28:26 · 858 views · 0 comments
Java version: porting the Python client to Java when calling BERT through tf_serving
Python version:
exam = self.processor.one_example(sentence)  # list of samples to predict
feature = convert_single_example(0, exam, label_list, arg_dic['max_seq_length'], self.tokenizer)
features = dict()
features['input_ids'] = tf.train.Feature(int64_li ...
Original · 2022-04-01 20:54:18 · 1413 views · 0 comments
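Summary of three ways to save a BERT model and how to call each one (ckpt, single-file pb, and the pb used by tf_serving)
1. ckpt files saved during training. Saving produces four main files:
1) checkpoint: records which model files exist in the current directory and which one is the latest. Example contents:
model_checkpoint_path: "model.ckpt-2625"
all_model_checkpoint_paths: "model.ckpt-2000"
all_model_checkpoint_paths: "model.ckpt-2625"
2) model.ckpt-2625.data-00000-of-000...
Original · 2022-04-01 20:36:01 · 5907 views · 0 comments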
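The `checkpoint` index file is plain text, so its contents can be read with a few lines of standard-library Python. The sketch below parses the example contents shown above; `parse_checkpoint_state` is a hypothetical helper name. In a TensorFlow program, `tf.train.latest_checkpoint` would normally be used instead.

```python
import re

# Example contents of the `checkpoint` file, as listed in the excerpt.
CHECKPOINT_TEXT = '''model_checkpoint_path: "model.ckpt-2625"
all_model_checkpoint_paths: "model.ckpt-2000"
all_model_checkpoint_paths: "model.ckpt-2625"
'''


def parse_checkpoint_state(text):
    """Return (latest_checkpoint, list_of_all_checkpoints) from the file text."""
    latest = None
    all_paths = []
    for line in text.splitlines():
        # Each line has the form: key: "value"
        match = re.match(r'\s*(\w+):\s*"(.*)"\s*$', line)
        if not match:
            continue
        key, value = match.groups()
        if key == "model_checkpoint_path":
            latest = value
        elif key == "all_model_checkpoint_paths":
            all_paths.append(value)
    return latest, all_paths


latest, all_paths = parse_checkpoint_state(CHECKPOINT_TEXT)
```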
Example of using focal loss for NER
import numpy as np
def get_one_hot(labels):
    # number of classes
    num_classes = 3
    # one-hot codes
    one_hot_codes = np.eye(num_classes)
    one_hot_labels = []
    for label in labels:
        # map each integer class id to its one-hot code
        one_hot_label = one_hot_codes[label]
        one_hot_labels.append(one_hot_l ...
Original · 2022-03-29 17:51:58 · 2423 views · 0 comments
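Because the excerpt stops before the loss itself, here is a NumPy sketch of how one-hot labels typically feed a multi-class focal loss. It is an assumption about the general technique rather than the post's own code: the softmax step, the `gamma` default, and the helper names are all illustrative.

```python
import numpy as np


def get_one_hot(labels, num_classes=3):
    """Map integer class ids to one-hot rows by indexing an identity matrix."""
    return np.eye(num_classes)[np.asarray(labels)]


def softmax(logits):
    """Numerically stable softmax over the last axis."""
    shifted = logits - logits.max(axis=-1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=-1, keepdims=True)


def focal_loss(logits, labels, gamma=2.0, num_classes=3):
    """Mean multi-class focal loss: -(1 - p_t)^gamma * log(p_t)."""
    probs = softmax(np.asarray(logits, dtype=np.float64))
    one_hot = get_one_hot(labels, num_classes)
    # Probability assigned to the true class of each token.
    p_t = np.clip((probs * one_hot).sum(axis=-1), 1e-12, 1.0)
    return float(np.mean(-((1.0 - p_t) ** gamma) * np.log(p_t)))


codes = get_one_hot([0, 2])
uniform_logits = [[0.0, 0.0, 0.0]]
cross_entropy = focal_loss(uniform_logits, [1], gamma=0.0)
focal = focal_loss(uniform_logits, [1], gamma=2.0)
```

With `gamma=0` the expression reduces to ordinary cross-entropy; larger values shrink the contribution of tokens the model already classifies confidently, which is why it is often tried on NER data dominated by the O tag.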
Inspecting the inputs and outputs of a pb model used by tf serving
Use the following command:
saved_model_cli show --all --dir output_pb/
The output looks like this:
saved_model_cli show --all --dir output_pb/1648525103/
2022-03-29 16:59:55.132993: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcuda ...
Original · 2022-03-29 17:01:25 · 2881 views · 0 comments
Notes on reproducing MCN, a real-time joint detection and segmentation network for referring expressions
Repository reproduced (many thanks to the authors!): GitHub - luogen1996/MCN: [CVPR2020] Multi-task Collaborative Network for Joint Referring Expression Comprehension and Segmentation, CVPR2020 (oral)
Original · 2021-12-22 10:33:53 · 381 views · 0 comments
Implementing dropout in Python
The core idea is to sample the keep/drop mask from a binomial distribution with np.random.binomial().
import numpy as np
def dropout(input_data, prob):
    if prob < 0 or prob > 1:
        raise ValueError("prob must be between 0 and 1")
    retain_prob = 1 - prob
    sample = np.random.binomial(1, retain_prob, input_data.shape)
    x = input_dat ...
Original · 2021-11-12 17:58:54 · 2045 views · 1 comment
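The excerpt is cut off before the mask is applied, so the following is a sketch of one common way to finish the idea, not the post's own code. It assumes inverted dropout, where kept values are divided by the retain probability so the expected activation is unchanged. The `rng` parameter is an added convenience for reproducibility.

```python
import numpy as np


def dropout(input_data, prob, rng=None):
    """Inverted dropout: zero each element with probability `prob`."""
    if prob < 0 or prob >= 1:
        # prob == 1 would drop everything and divide by zero below.
        raise ValueError("prob must be in [0, 1)")
    if rng is None:
        rng = np.random.default_rng()
    retain_prob = 1.0 - prob
    # 1 keeps the element, 0 drops it; one Bernoulli draw per element.
    mask = rng.binomial(1, retain_prob, size=input_data.shape)
    # Rescale the kept elements so the expected value matches the input.
    return input_data * mask / retain_prob


x = np.ones(1000)
out = dropout(x, 0.5, rng=np.random.default_rng(0))
unchanged = dropout(x, 0.0)
```

At inference time the function is simply not called, since the rescaling during training already keeps the expected scale consistent.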
ValueError: `generator` yielded an element of shape (2,) where an element of shape (?, ?) was expect
Building the network input with dataset = tf.data.Dataset.from_generator produced the following error:
(1) Invalid argument: ValueError: `generator` yielded an element of shape (2,) where an element of shape (?, ?) was expected.
Traceback (most recent call last):
  File "/usr/local/lib/python3.6/di ...
Original · 2021-05-07 10:30:42 · 1960 views · 0 comments
Text clustering with LDA (latent Dirichlet allocation)
# -*- coding: utf-8 -*-
import jieba
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.decomposition import LatentDirichletAllocation
from sklearn.decomposition import PCA
import matplotlib.pyplot as plt
import matplotlib
matplotli ...
Original · 2021-03-23 11:38:05 · 658 views · 0 comments
tf.reshape() raises TypeError: Failed to convert object of type <class 'list'> to Tensor. Contents: [-1,
While running a TensorFlow program, tf.reshape() raised TypeError: Failed to convert object of type <class 'list'> to Tensor. Contents: [-1, None, 17]. Consider casting elements to a supported type. The cause is the None in the middle of the target shape; give that dimension an explicit value instead.
#max_seq_length = FLAGS.max_seq_length # ...
Original · 2021-03-12 16:25:33 · 4392 views · 0 comments
Using a CRF layer for NER in TensorFlow
def project_crf_layer(self, embedding_chars, name=None):
    """
    hidden layer between input layer and logits
    :param lstm_outputs: [batch_size, num_steps, emb_size]
    :return: [batch_size, num_steps, num_tags]
    """
    ...
Original · 2021-03-11 15:04:54 · 686 views · 0 comments
Implementing a BLSTM with TensorFlow
BLSTM: code first.
class BLSTM(object):
    def __init__(self, embedded_chars, hidden_unit, cell_type, num_layers, dropout_rate, initializers, num_labels, seq_length, labels, lengths, is_training):
        """
        BLSTM network
        :param em...
Original · 2021-03-11 14:36:02 · 1022 views · 0 comments
Extracting and saving the parameters of selected BERT layers
import tensorflow as tf
import os
sess = tf.Session()
last_name = 'bert_model.ckpt'
model_path = 'chinese_L-12_H-768_A-12'
imported_meta = tf.train.import_meta_graph(os.path.join(model_path, last_name + '.meta'))
imported_meta.restore(sess, os.path.jo...
Original · 2021-03-10 15:38:33 · 861 views · 0 comments
Adding an L2 loss to the training loss when using BERT for text classification in TensorFlow
Method 1:
tvars = tf.trainable_variables()
regularizer = tf.contrib.layers.l2_regularizer(1e-4)
l2_loss = 0.0
for var in tvars:
    l2_loss = l2_loss + regularizer(var)
loss = loss + l2_loss
Method 2:
tvars...
Original · 2020-12-14 15:00:30 · 405 views · 0 comments
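The arithmetic behind Method 1 can be checked in plain Python. This sketch assumes the regularizer computes `scale * sum(w ** 2) / 2` per variable, which is my understanding of `tf.contrib.layers.l2_regularizer`. The variable names, their values, and `task_loss` are made up for illustration.

```python
def l2_regularizer(scale):
    """Return a function computing scale * sum(w^2) / 2 for a list of weights."""
    def regularize(weights):
        return scale * sum(w * w for w in weights) / 2.0
    return regularize


# Hypothetical trainable variables, flattened to lists of floats.
variables = {
    "dense/kernel": [1.0, -2.0],
    "dense/bias": [3.0],
}

regularizer = l2_regularizer(1e-4)
l2_loss = 0.0
for values in variables.values():
    # Accumulate the penalty of every trainable variable.
    l2_loss += regularizer(values)

task_loss = 0.5  # hypothetical classification loss
loss = task_loss + l2_loss
```

Looping over every trainable variable also penalizes biases and layer-norm parameters; filtering the variable list by name is a common refinement when that is not wanted.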
Binary and multi-class implementations of focal loss
1. TensorFlow version
# binary classification
def binary_focal_loss(gamma=2, alpha=0.25):
    alpha = tf.constant(alpha, dtype=tf.float32)
    gamma = tf.constant(gamma, dtype=tf.float32)
    def focal_loss_sigmoid(y_true, y_pred):
        labels = tf.cast(y_true, tf.float32)
        L ...
Original · 2020-12-07 11:53:10 · 3903 views · 6 comments
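Since the TensorFlow body is truncated, the formula is easier to see in a dependency-free sketch. The version below assumes the usual convention where `alpha` weights the positive class and `1 - alpha` the negative class, and where `y_pred` already holds sigmoid probabilities. It is an illustration of the technique, not a reconstruction of the missing code.

```python
import math


def binary_focal_loss(y_true, y_pred, gamma=2.0, alpha=0.25, eps=1e-7):
    """Mean binary focal loss over paired labels and predicted probabilities."""
    total = 0.0
    for y, p in zip(y_true, y_pred):
        # Clip to avoid log(0).
        p = min(max(p, eps), 1.0 - eps)
        # p_t is the probability the model assigns to the true class.
        p_t = p if y == 1 else 1.0 - p
        alpha_t = alpha if y == 1 else 1.0 - alpha
        total += -alpha_t * (1.0 - p_t) ** gamma * math.log(p_t)
    return total / len(y_true)


easy = binary_focal_loss([1], [0.9])
hard = binary_focal_loss([1], [0.1])
half_bce = binary_focal_loss([1], [0.5], gamma=0.0, alpha=0.5)
```

With `gamma=0` and `alpha=0.5` the loss is half of binary cross-entropy, which makes a convenient sanity check when porting the formula to another framework.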
How to use tf.flags in TensorFlow
import tensorflow as tf
# define the flags object
flags = tf.flags
FLAGS = flags.FLAGS
# define the flags and their default values
flags.DEFINE_string("path", "/data", "data's path")
flags.DEFINE_float("money", 20.0, "the value of money")
flags.DEFINE_integer("step", 20, "training step")
flags.DEFINE_bool("T ...
Original · 2020-11-25 10:50:22 · 617 views · 0 comments