def create_model(params):
    model = getattr(ternausnet.models, params["model"])(pretrained=True)
    model = model.to(params["device"])
    return model
def train_and_validate(model, train_dataset, val_dataset, params):
    train_loader = DataLoader(
        train_dataset,
        batch_size=params["batch_size"],
        shuffle=True,
        num_workers=params["num_workers"],
        pin_memory=True,
    )
    val_loader = DataLoader(
        val_dataset,
        batch_size=params["batch_size"],
        shuffle=False,
        num_workers=params["num_workers"],
        pin_memory=True,
    )
    criterion = nn.BCEWithLogitsLoss().to(params["device"])
    optimizer = torch.optim.Adam(model.parameters(), lr=params["lr"])
    for epoch in range(1, params["epochs"] + 1):
        train(train_loader, model, criterion, optimizer, epoch, params)
        validate(val_loader, model, criterion, epoch, params)
    return model
def predict(model, params, test_dataset, batch_size):
    test_loader = DataLoader(
        test_dataset, batch_size=batch_size, shuffle=False, num_workers=params["num_workers"], pin_memory=True,
    )
    model.eval()
    predictions = []
    with torch.no_grad():
        for images, (original_heights, original_widths) in test_loader:
            images = images.to(params["device"], non_blocking=True)
            output = model(images)
            probabilities = torch.sigmoid(output.squeeze(1))
            predicted_masks = (probabilities >= 0.5).float()
            predicted_masks = predicted_masks.cpu().numpy()
            for predicted_mask, original_height, original_width in zip(
                predicted_masks, original_heights.numpy(), original_widths.numpy()
            ):
                predictions.append((predicted_mask, original_height, original_width))
    return predictions
Defining training parameters
Here we define training parameters such as the model architecture, learning rate, batch size, and number of epochs.
params = {
    "model": "UNet11",
    "device": "cuda",
    "lr": 0.001,
    "batch_size": 16,
    "num_workers": 4,
    "epochs": 10,
}
Training the model
model = create_model(params)
model = train_and_validate(model, train_dataset, val_dataset, params)
Epoch: 1. Train. Loss: 0.415: 100%|██████████| 375/375 [01:42<00:00, 3.66it/s]
Epoch: 1. Validation. Loss: 0.210: 100%|██████████| 86/86 [00:09<00:00, 9.55it/s]
Epoch: 2. Train. Loss: 0.257: 100%|██████████| 375/375 [01:40<00:00, 3.75it/s]
Epoch: 2. Validation. Loss: 0.178: 100%|██████████| 86/86 [00:08<00:00, 10.62it/s]
Epoch: 3. Train. Loss: 0.221: 100%|██████████| 375/375 [01:39<00:00, 3.75it/s]
Epoch: 3. Validation. Loss: 0.168: 100%|██████████| 86/86 [00:08<00:00, 10.58it/s]
Epoch: 4. Train. Loss: 0.209: 100%|██████████| 375/375 [01:40<00:00, 3.75it/s]
Epoch: 4. Validation. Loss: 0.156: 100%|██████████| 86/86 [00:08<00:00, 10.57it/s]
Epoch: 5. Train. Loss: 0.190: 100%|██████████| 375/375 [01:40<00:00, 3.75it/s]
Epoch: 5. Validation. Loss: 0.149: 100%|██████████| 86/86 [00:08<00:00, 10.57it/s]
Epoch: 6. Train. Loss: 0.179: 100%|██████████| 375/375 [01:39<00:00, 3.75it/s]
Epoch: 6. Validation. Loss: 0.155: 100%|██████████| 86/86 [00:08<00:00, 10.55it/s]
Epoch: 7. Train. Loss: 0.175: 100%|██████████| 375/375 [01:40<00:00, 3.75it/s]
Epoch: 7. Validation. Loss: 0.147: 100%|██████████| 86/86 [00:08<00:00, 10.59it/s]
Epoch: 8. Train. Loss: 0.167: 100%|██████████| 375/375 [01:40<00:00, 3.75it/s]
Epoch: 8. Validation. Loss: 0.146: 100%|██████████| 86/86 [00:08<00:00, 10.61it/s]
Epoch: 9. Train. Loss: 0.165: 100%|██████████| 375/375 [01:40<00:00, 3.75it/s]
Epoch: 9. Validation. Loss: 0.131: 100%|██████████| 86/86 [00:08<00:00, 10.56it/s]
Epoch: 10. Train. Loss: 0.156: 100%|██████████| 375/375 [01:40<00:00, 3.75it/s]
Epoch: 10. Validation. Loss: 0.140: 100%|██████████| 86/86 [00:08<00:00, 10.60it/s]
Predicting image masks and visualizing the predictions
Now that we have a trained model, let's predict masks for some images. Note that __getitem__ returns not only the image but also the image's original height and width; we will use these values to resize the predicted masks from 256x256 pixels back to the original image size.
class OxfordPetInferenceDataset(Dataset):
    def __init__(self, images_filenames, images_directory, transform=None):
        self.images_filenames = images_filenames
        self.images_directory = images_directory
        self.transform = transform

    def __len__(self):
        return len(self.images_filenames)

    def __getitem__(self, idx):
        image_filename = self.images_filenames[idx]
        image = cv2.imread(os.path.join(self.images_directory, image_filename))
        image = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)
        original_size = tuple(image.shape[:2])
        if self.transform is not None:
            transformed = self.transform(image=image)
            image = transformed["image"]
        return image, original_size
test_transform = A.Compose(
    [A.Resize(256, 256), A.Normalize(mean=(0.485, 0.456, 0.406), std=(0.229, 0.224, 0.225)), ToTensorV2()]
)
test_dataset = OxfordPetInferenceDataset(test_images_filenames, images_directory, transform=test_transform)
predictions = predict(model, params, test_dataset, batch_size=16)
Next, we resize the 256x256-pixel predicted masks back to the original image sizes.
predicted_masks = []
for predicted_256x256_mask, original_height, original_width in predictions:
    full_sized_mask = F.resize(
        predicted_256x256_mask, height=original_height, width=original_width, interpolation=cv2.INTER_NEAREST
    )
    predicted_masks.append(full_sized_mask)
display_image_grid(test_images_filenames, images_directory, masks_directory, predicted_masks=predicted_masks)
Full code for Approach 1 (tested, runs as-is)
===============
from collections import defaultdict
import copy
import random
import os
import shutil
from urllib.request import urlretrieve
import albumentations as A
import albumentations.augmentations.functional as F
from albumentations.pytorch import ToTensorV2
import cv2
import matplotlib.pyplot as plt
import numpy as np
import ternausnet.models
from tqdm import tqdm
import torch
import torch.backends.cudnn as cudnn
import torch.nn as nn
import torch.optim
from torch.utils.data import Dataset, DataLoader
cudnn.benchmark = True
class TqdmUpTo(tqdm):
    def update_to(self, b=1, bsize=1, tsize=None):
        if tsize is not None:
            self.total = tsize
        self.update(b * bsize - self.n)

def download_url(url, filepath):
    directory = os.path.dirname(os.path.abspath(filepath))
    os.makedirs(directory, exist_ok=True)
    if os.path.exists(filepath):
        print("Dataset already exists on the disk. Skipping download.")
        return
    with TqdmUpTo(unit="B", unit_scale=True, unit_divisor=1024, miniters=1, desc=os.path.basename(filepath)) as t:
        urlretrieve(url, filename=filepath, reporthook=t.update_to, data=None)
        t.total = t.n
def extract_archive(filepath):
    extract_dir = os.path.dirname(os.path.abspath(filepath))
    shutil.unpack_archive(filepath, extract_dir)

def preprocess_mask(mask):
    # Oxford-IIIT Pet trimaps use 1.0 = pet, 2.0 = background, 3.0 = border.
    # Convert to a binary mask: background -> 0.0, pet and border -> 1.0.
    mask = mask.astype(np.float32)
    mask[mask == 2.0] = 0.0
    mask[(mask == 1.0) | (mask == 3.0)] = 1.0
    return mask
def display_image_grid(images_filenames, images_directory, masks_directory, predicted_masks=None):
    cols = 3 if predicted_masks else 2
    rows = len(images_filenames)
    figure, ax = plt.subplots(nrows=rows, ncols=cols, figsize=(10, 24))
    for i, image_filename in enumerate(images_filenames):
        image = cv2.imread(os.path.join(images_directory, image_filename))
        image = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)
        mask = cv2.imread(os.path.join(masks_directory, image_filename.replace(".jpg", ".png")), cv2.IMREAD_UNCHANGED)
        mask = preprocess_mask(mask)
        ax[i, 0].imshow(image)
        ax[i, 1].imshow(mask, interpolation="nearest")
        ax[i, 0].set_title("Image")
        ax[i, 1].set_title("Ground truth mask")
        ax[i, 0].set_axis_off()
        ax[i, 1].set_axis_off()
        if predicted_masks:
            predicted_mask = predicted_masks[i]
            ax[i, 2].imshow(predicted_mask, interpolation="nearest")
            ax[i, 2].set_title("Predicted mask")
            ax[i, 2].set_axis_off()
    plt.tight_layout()
    plt.show()
class OxfordPetDataset(Dataset):
    def __init__(self, images_filenames, images_directory, masks_directory, transform=None):
        self.images_filenames = images_filenames
        self.images_directory = images_directory
        self.masks_directory = masks_directory
        self.transform = transform

    def __len__(self):
        return len(self.images_filenames)

    def __getitem__(self, idx):
        image_filename = self.images_filenames[idx]
        image = cv2.imread(os.path.join(self.images_directory, image_filename))
        image = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)
        mask = cv2.imread(
            os.path.join(self.masks_directory, image_filename.replace(".jpg", ".png")), cv2.IMREAD_UNCHANGED,
        )
        mask = preprocess_mask(mask)
        if self.transform is not None:
            transformed = self.transform(image=image, mask=mask)
            image = transformed["image"]
            mask = transformed["mask"]
        return image, mask
def visualize_augmentations(dataset, idx=0, samples=5):
    dataset = copy.deepcopy(dataset)
    dataset.transform = A.Compose([t for t in dataset.transform if not isinstance(t, (A.Normalize, ToTensorV2))])
    figure, ax = plt.subplots(nrows=samples, ncols=2, figsize=(10, 24))
    for i in range(samples):
        image, mask = dataset[idx]
        ax[i, 0].imshow(image)
        ax[i, 1].imshow(mask, interpolation="nearest")
        ax[i, 0].set_title("Augmented image")
        ax[i, 1].set_title("Augmented mask")
        ax[i, 0].set_axis_off()
        ax[i, 1].set_axis_off()
    plt.tight_layout()
    plt.show()
class MetricMonitor:
    def __init__(self, float_precision=3):
        self.float_precision = float_precision
        self.reset()

    def reset(self):
        self.metrics = defaultdict(lambda: {"val": 0, "count": 0, "avg": 0})

    def update(self, metric_name, val):
        metric = self.metrics[metric_name]
        metric["val"] += val
        metric["count"] += 1
        metric["avg"] = metric["val"] / metric["count"]

    def __str__(self):
        return " | ".join(
            [
                "{metric_name}: {avg:.{float_precision}f}".format(
                    metric_name=metric_name, avg=metric["avg"], float_precision=self.float_precision
                )
                for (metric_name, metric) in self.metrics.items()
            ]
        )
def train(train_loader, model, criterion, optimizer, epoch, params):
    metric_monitor = MetricMonitor()
    model.train()
    stream = tqdm(train_loader)
    for i, (images, target) in enumerate(stream, start=1):
        images = images.to(params["device"], non_blocking=True)
        target = target.to(params["device"], non_blocking=True)
        output = model(images).squeeze(1)
        loss = criterion(output, target)
        metric_monitor.update("Loss", loss.item())
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
        stream.set_description(
            "Epoch: {epoch}. Train. {metric_monitor}".format(epoch=epoch, metric_monitor=metric_monitor)
        )
def validate(val_loader, model, criterion, epoch, params):
    metric_monitor = MetricMonitor()
    model.eval()
    stream = tqdm(val_loader)
    with torch.no_grad():
        for i, (images, target) in enumerate(stream, start=1):
            images = images.to(params["device"], non_blocking=True)
            target = target.to(params["device"], non_blocking=True)
            output = model(images).squeeze(1)
            loss = criterion(output, target)
            metric_monitor.update("Loss", loss.item())
            stream.set_description(
                "Epoch: {epoch}. Validation. {metric_monitor}".format(epoch=epoch, metric_monitor=metric_monitor)
            )
def create_model(params):
    model = getattr(ternausnet.models, params["model"])(pretrained=True)
    model = model.to(params["device"])
    return model
def train_and_validate(model, train_dataset, val_dataset, params):
    train_loader = DataLoader(
        train_dataset,
        batch_size=params["batch_size"],
        shuffle=True,
        num_workers=params["num_workers"],
        pin_memory=True,
    )
    val_loader = DataLoader(
        val_dataset,
        batch_size=params["batch_size"],
        shuffle=False,
        num_workers=params["num_workers"],
        pin_memory=True,
    )
    criterion = nn.BCEWithLogitsLoss().to(params["device"])
    optimizer = torch.optim.Adam(model.parameters(), lr=params["lr"])
    for epoch in range(1, params["epochs"] + 1):
        train(train_loader, model, criterion, optimizer, epoch, params)
        validate(val_loader, model, criterion, epoch, params)
    return model
def predict(model, params, test_dataset, batch_size):
    test_loader = DataLoader(
        test_dataset, batch_size=batch_size, shuffle=False, num_workers=params["num_workers"], pin_memory=True,
    )
    model.eval()
    predictions = []
    with torch.no_grad():
        for images, (original_heights, original_widths) in test_loader:
            images = images.to(params["device"], non_blocking=True)
            output = model(images)
            probabilities = torch.sigmoid(output.squeeze(1))
            predicted_masks = (probabilities >= 0.5).float()
            predicted_masks = predicted_masks.cpu().numpy()
            for predicted_mask, original_height, original_width in zip(
                predicted_masks, original_heights.numpy(), original_widths.numpy()
            ):
                predictions.append((predicted_mask, original_height, original_width))
    return predictions
class OxfordPetInferenceDataset(Dataset):
    def __init__(self, images_filenames, images_directory, transform=None):
        self.images_filenames = images_filenames
        self.images_directory = images_directory
        self.transform = transform

    def __len__(self):
        return len(self.images_filenames)

    def __getitem__(self, idx):
        image_filename = self.images_filenames[idx]
        image = cv2.imread(os.path.join(self.images_directory, image_filename))
        image = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)
        original_size = tuple(image.shape[:2])
        if self.transform is not None:
            transformed = self.transform(image=image)
            image = transformed["image"]
        return image, original_size
if __name__ == "__main__":
    dataset_directory = "datasets/oxford-iiit-pet"
    filepath = os.path.join(dataset_directory, "images.tar.gz")
    download_url(
        url="https://www.robots.ox.ac.uk/~vgg/data/pets/data/images.tar.gz", filepath=filepath,
    )
    extract_archive(filepath)
    filepath = os.path.join(dataset_directory, "annotations.tar.gz")
    download_url(
        url="https://www.robots.ox.ac.uk/~vgg/data/pets/data/annotations.tar.gz", filepath=filepath,
    )
    extract_archive(filepath)
    root_directory = os.path.join(dataset_directory)
    images_directory = os.path.join(root_directory, "images")
    masks_directory = os.path.join(root_directory, "annotations", "trimaps")
    images_filenames = list(sorted(os.listdir(images_directory)))
    # Keep only images that OpenCV can actually decode.
    correct_images_filenames = [
        i for i in images_filenames if cv2.imread(os.path.join(images_directory, i)) is not None
    ]
    random.seed(42)
    random.shuffle(correct_images_filenames)
    train_images_filenames = correct_images_filenames[:6000]
    val_images_filenames = correct_images_filenames[6000:-10]
    test_images_filenames = images_filenames[-10:]
    print(len(train_images_filenames), len(val_images_filenames), len(test_images_filenames))
    display_image_grid(test_images_filenames, images_directory, masks_directory)
    example_image_filename = correct_images_filenames[0]
    image = cv2.imread(os.path.join(images_directory, example_image_filename))
    image = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)
    resized_image = F.resize(image, height=256, width=256)
    padded_image = F.pad(image, min_height=512, min_width=512)
    padded_constant_image = F.pad(image, min_height=512, min_width=512, border_mode=cv2.BORDER_CONSTANT)
    cropped_image = F.center_crop(image, crop_height=256, crop_width=256)
    figure, ax = plt.subplots(nrows=1, ncols=5, figsize=(18, 10))
    ax.ravel()[0].imshow(image)
    ax.ravel()[0].set_title("Original image")
    ax.ravel()[1].imshow(resized_image)
    ax.ravel()[1].set_title("Resized image")
    ax.ravel()[2].imshow(cropped_image)
    ax.ravel()[2].set_title("Cropped image")
    ax.ravel()[3].imshow(padded_image)
    ax.ravel()[3].set_title("Image padded with reflection")
    ax.ravel()[4].imshow(padded_constant_image)
    ax.ravel()[4].set_title("Image padded with constant padding")
    plt.tight_layout()
    plt.show()
    train_transform = A.Compose(
        [
            A.Resize(256, 256),
            A.ShiftScaleRotate(shift_limit=0.2, scale_limit=0.2, rotate_limit=30, p=0.5),
            A.RGBShift(r_shift_limit=25, g_shift_limit=25, b_shift_limit=25, p=0.5),
            A.RandomBrightnessContrast(brightness_limit=0.3, contrast_limit=0.3, p=0.5),
            A.Normalize(mean=(0.485, 0.456, 0.406), std=(0.229, 0.224, 0.225)),
            ToTensorV2(),
        ]
    )
    train_dataset = OxfordPetDataset(train_images_filenames, images_directory, masks_directory,
                                     transform=train_transform)
    val_transform = A.Compose(
        [A.Resize(256, 256), A.Normalize(mean=(0.485, 0.456, 0.406), std=(0.229, 0.224, 0.225)), ToTensorV2()]
    )
    val_dataset = OxfordPetDataset(val_images_filenames, images_directory, masks_directory, transform=val_transform)
    random.seed(42)
    visualize_augmentations(train_dataset, idx=55)
    params = {
        "model": "UNet11",
        "device": "cuda",
        "lr": 0.001,
        "batch_size": 8,
        "num_workers": 0,
        "epochs": 10,
    }
    model = create_model(params)
    model = train_and_validate(model, train_dataset, val_dataset, params)
    test_transform = A.Compose(
        [A.Resize(256, 256), A.Normalize(mean=(0.485, 0.456, 0.406), std=(0.229, 0.224, 0.225)), ToTensorV2()]
    )
    test_dataset = OxfordPetInferenceDataset(test_images_filenames, images_directory, transform=test_transform)
    predictions = predict(model, params, test_dataset, batch_size=16)
    predicted_masks = []
    for predicted_256x256_mask, original_height, original_width in predictions:
        full_sized_mask = F.resize(
            predicted_256x256_mask, height=original_height, width=original_width, interpolation=cv2.INTER_NEAREST
        )
        predicted_masks.append(full_sized_mask)
    display_image_grid(test_images_filenames, images_directory, masks_directory, predicted_masks=predicted_masks)
Approach 2: Predicting masks for full-sized images
==============
We will reuse most of the code from the previous example.
Some images in the dataset have a height or width smaller than the crop size (256x256 pixels), so we first apply A.PadIfNeeded(min_height=256, min_width=256), which pads any image whose height or width is below 256 pixels. A quick shape check is sketched below.
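As a quick illustration (a minimal sketch with a hypothetical toy array, not part of the original tutorial), padding an undersized image and then random-cropping yields the expected 256x256 shape:

import albumentations as A
import numpy as np

# Hypothetical toy image whose height (200px) is below the 256px crop size.
small_image = np.zeros((200, 300, 3), dtype=np.uint8)
aug = A.Compose([A.PadIfNeeded(min_height=256, min_width=256), A.RandomCrop(256, 256)])
print(aug(image=small_image)["image"].shape)  # expected: (256, 256, 3)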
train_transform = A.Compose(
    [
        A.PadIfNeeded(min_height=256, min_width=256),
        A.RandomCrop(256, 256),
        A.ShiftScaleRotate(shift_limit=0.05, scale_limit=0.05, rotate_limit=15, p=0.5),
        A.RGBShift(r_shift_limit=15, g_shift_limit=15, b_shift_limit=15, p=0.5),
        A.RandomBrightnessContrast(p=0.5),
        A.Normalize(mean=(0.485, 0.456, 0.406), std=(0.229, 0.224, 0.225)),
        ToTensorV2(),
    ]
)
train_dataset = OxfordPetDataset(train_images_filenames, images_directory, masks_directory, transform=train_transform)
val_transform = A.Compose(
    [
        A.PadIfNeeded(min_height=256, min_width=256),
        A.CenterCrop(256, 256),
        A.Normalize(mean=(0.485, 0.456, 0.406), std=(0.229, 0.224, 0.225)),
        ToTensorV2(),
    ]
)
val_dataset = OxfordPetDataset(val_images_filenames, images_directory, masks_directory, transform=val_transform)
params = {
    "model": "UNet11",
    "device": "cuda",
    "lr": 0.001,
    "batch_size": 16,
    "num_workers": 4,
    "epochs": 10,
}
model = create_model(params)
model = train_and_validate(model, train_dataset, val_dataset, params)
Epoch: 1. Train. Loss: 0.445: 100%|██████████| 375/375 [01:40<00:00, 3.75it/s]
Epoch: 1. Validation. Loss: 0.279: 100%|██████████| 86/86 [00:08<00:00, 10.49it/s]
Epoch: 2. Train. Loss: 0.311: 100%|██████████| 375/375 [01:39<00:00, 3.75it/s]
Epoch: 2. Validation. Loss: 0.238: 100%|██████████| 86/86 [00:08<00:00, 10.51it/s]
Epoch: 3. Train. Loss: 0.259: 100%|██████████| 375/375 [01:39<00:00, 3.75it/s]
Epoch: 3. Validation. Loss: 0.206: 100%|██████████| 86/86 [00:08<00:00, 10.54it/s]
Epoch: 4. Train. Loss: 0.244: 100%|██████████| 375/375 [01:39<00:00, 3.75it/s]
Epoch: 4. Validation. Loss: 0.211: 100%|██████████| 86/86 [00:08<00:00, 10.54it/s]
Epoch: 5. Train. Loss: 0.224: 100%|██████████| 375/375 [01:40<00:00, 3.74it/s]
Epoch: 5. Validation. Loss: 0.270: 100%|██████████| 86/86 [00:08<00:00, 10.47it/s]
Epoch: 6. Train. Loss: 0.207: 100%|██████████| 375/375 [01:40<00:00, 3.75it/s]
Epoch: 6. Validation. Loss: 0.169: 100%|██████████| 86/86 [00:08<00:00, 10.56it/s]
Epoch: 7. Train. Loss: 0.212: 100%|██████████| 375/375 [01:40<00:00, 3.75it/s]
Epoch: 7. Validation. Loss: 0.169: 100%|██████████| 86/86 [00:08<00:00, 10.56it/s]
Epoch: 8. Train. Loss: 0.189: 100%|██████████| 375/375 [01:40<00:00, 3.75it/s]
Epoch: 8. Validation. Loss: 0.201: 100%|██████████| 86/86 [00:08<00:00, 10.52it/s]
Epoch: 9. Train. Loss: 0.185: 100%|██████████| 375/375 [01:39<00:00, 3.75it/s]
Epoch: 9. Validation. Loss: 0.162: 100%|██████████| 86/86 [00:08<00:00, 10.54it/s]
Epoch: 10. Train. Loss: 0.187: 100%|██████████| 375/375 [01:39<00:00, 3.75it/s]
Epoch: 10. Validation. Loss: 0.159: 100%|██████████| 86/86 [00:08<00:00, 10.49it/s]
All images in the test dataset have a longest side of at most 500 pixels. Since PyTorch requires that all images in a batch have the same size, and UNet requires that image sizes be divisible by 16, we apply A.PadIfNeeded(min_height=512, min_width=512, border_mode=cv2.BORDER_CONSTANT). This augmentation pads the image borders with zeros, so the image size becomes 512x512 pixels. A quick check of this size arithmetic is sketched below.
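As a sanity check (a sketch with a hypothetical 375x500 array, not from the original text), the constant-border padding produces 512x512 outputs, which are divisible by 16:

import albumentations as A
import cv2
import numpy as np

# Hypothetical test-sized image: no side exceeds 500px.
img = np.zeros((375, 500, 3), dtype=np.uint8)
pad = A.PadIfNeeded(min_height=512, min_width=512, border_mode=cv2.BORDER_CONSTANT)
padded = pad(image=img)["image"]
print(padded.shape)                                # (512, 512, 3)
print(padded.shape[0] % 16, padded.shape[1] % 16)  # 0 0: divisible by 16, as UNet requires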
test_transform = A.Compose(
    [
        A.PadIfNeeded(min_height=512, min_width=512, border_mode=cv2.BORDER_CONSTANT),
        A.Normalize(mean=(0.485, 0.456, 0.406), std=(0.229, 0.224, 0.225)),
        ToTensorV2(),
    ]
)
test_dataset = OxfordPetInferenceDataset(test_images_filenames, images_directory, transform=test_transform)
predictions = predict(model, params, test_dataset, batch_size=16)
Since the model returns masks for the padded images, we need to crop a region of the original image size out of each padded mask.
predicted_masks = []
for predicted_padded_mask, original_height, original_width in predictions:
    cropped_mask = F.center_crop(predicted_padded_mask, original_height, original_width)
    predicted_masks.append(cropped_mask)
display_image_grid(test_images_filenames, images_directory, masks_directory, predicted_masks=predicted_masks)
Approach 3: Using the original images
===========
We could also use the original images without resizing or cropping them. However, this dataset presents a problem: some of its images are so large that training on them requires more than 11GB of GPU memory even with batch_size=1. As a compromise, we first apply A.LongestMaxSize(512), which ensures that the longest side of each image does not exceed 512 pixels. Of the 7384 images in the dataset, this augmentation affects only 137.
Next, A.PadIfNeeded(min_height=512, min_width=512) ensures that all images in a batch are 512x512 pixels; a quick shape check is sketched below.
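To illustrate how the two transforms interact (a minimal sketch with a hypothetical oversized array, not part of the original code):

import albumentations as A
import numpy as np

# Hypothetical large image: the longest side (600px) exceeds the 512px limit.
large_image = np.zeros((375, 600, 3), dtype=np.uint8)
pipeline = A.Compose([A.LongestMaxSize(512), A.PadIfNeeded(min_height=512, min_width=512)])
out = pipeline(image=large_image)["image"]
print(out.shape)  # (512, 512, 3): rescaled to 320x512 first, then padded up to 512x512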
train_transform = A.Compose(
    [
        A.LongestMaxSize(512),
        A.PadIfNeeded(min_height=512, min_width=512),
        A.ShiftScaleRotate(shift_limit=0.05, scale_limit=0.05, rotate_limit=15, p=0.5),
        A.RGBShift(r_shift_limit=15, g_shift_limit=15, b_shift_limit=15, p=0.5),
        A.RandomBrightnessContrast(p=0.5),
        A.Normalize(mean=(0.485, 0.456, 0.406), std=(0.229, 0.224, 0.225)),
        ToTensorV2(),
    ]
)
train_dataset = OxfordPetDataset(train_images_filenames, images_directory, masks_directory, transform=train_transform)
val_transform = A.Compose(
    [
        A.LongestMaxSize(512),
        A.PadIfNeeded(min_height=512, min_width=512, border_mode=cv2.BORDER_CONSTANT),
        A.Normalize(mean=(0.485, 0.456, 0.406), std=(0.229, 0.224, 0.225)),
        ToTensorV2(),
    ]
)
val_dataset = OxfordPetDataset(val_images_filenames, images_directory, masks_directory, transform=val_transform)
params = {
    "model": "UNet11",
    "device": "cuda",
    "lr": 0.001,
    "batch_size": 8,
    "num_workers": 4,
    "epochs": 10,
}
model = create_model(params)
model = train_and_validate(model, train_dataset, val_dataset, params)
Epoch: 1. Train. Loss: 0.442: 100%|██████████| 750/750 [06:58<00:00, 1.79it/s]
Epoch: 1. Validation. Loss: 0.225: 100%|██████████| 172/172 [00:35<00:00, 4.80it/s]
Epoch: 2. Train. Loss: 0.283: 100%|██████████| 750/750 [06:54<00:00, 1.81it/s]
Epoch: 2. Validation. Loss: 0.188: 100%|██████████| 172/172 [00:34<00:00, 4.99it/s]
Epoch: 3. Train. Loss: 0.234: 100%|██████████| 750/750 [06:53<00:00, 1.81it/s]
Epoch: 3. Validation. Loss: 0.154: 100%|██████████| 172/172 [00:34<00:00, 4.96it/s]
Epoch: 4. Train. Loss: 0.211: 100%|██████████| 750/750 [06:53<00:00, 1.81it/s]
Epoch: 4. Validation. Loss: 0.136: 100%|██████████| 172/172 [00:34<00:00, 4.99it/s]
Epoch: 5. Train. Loss: 0.196: 100%|██████████| 750/750 [06:53<00:00, 1.81it/s]
Epoch: 5. Validation. Loss: 0.131: 100%|██████████| 172/172 [00:34<00:00, 4.96it/s]
Epoch: 6. Train. Loss: 0.187: 100%|██████████| 750/750 [06:53<00:00, 1.81it/s]
Epoch: 6. Validation. Loss: 0.151: 100%|██████████| 172/172 [00:34<00:00, 4.98it/s]
Epoch: 7. Train. Loss: 0.177: 100%|██████████| 750/750 [06:53<00:00, 1.81it/s]
Epoch: 7. Validation. Loss: 0.127: 100%|██████████| 172/172 [00:34<00:00, 4.98it/s]
Epoch: 8. Train. Loss: 0.171: 100%|██████████| 750/750 [06:53<00:00, 1.81it/s]
Epoch: 8. Validation. Loss: 0.113: 100%|██████████| 172/172 [00:34<00:00, 4.99it/s]
Epoch: 9. Train. Loss: 0.162: 100%|██████████| 750/750 [06:54<00:00, 1.81it/s]
Epoch: 9. Validation. Loss: 0.143: 100%|██████████| 172/172 [00:34<00:00, 4.94it/s]
Epoch: 10. Train. Loss: 0.157: 100%|██████████| 750/750 [06:53<00:00, 1.81it/s]
Epoch: 10. Validation. Loss: 0.115: 100%|██████████| 172/172 [00:34<00:00, 4.97it/s]
Next, we make predictions using the same code as in Approach 2.
test_transform = A.Compose(
    [
        A.PadIfNeeded(min_height=512, min_width=512, border_mode=cv2.BORDER_CONSTANT),
        A.Normalize(mean=(0.485, 0.456, 0.406), std=(0.229, 0.224, 0.225)),
        ToTensorV2(),
    ]
)
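The original text breaks off at this point; since it states that the remaining steps repeat Approach 2, the sketch below simply mirrors that inference code:

test_dataset = OxfordPetInferenceDataset(test_images_filenames, images_directory, transform=test_transform)
predictions = predict(model, params, test_dataset, batch_size=16)

predicted_masks = []
for predicted_padded_mask, original_height, original_width in predictions:
    # Crop the region of the original image size out of the padded mask, as in Approach 2.
    cropped_mask = F.center_crop(predicted_padded_mask, original_height, original_width)
    predicted_masks.append(cropped_mask)
display_image_grid(test_images_filenames, images_directory, masks_directory, predicted_masks=predicted_masks)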