import copy
import gc
import io
import json
import math
import os
import random
import re
import sys
import threading
import time
import traceback
from glob import glob
from itertools import combinations, product
import matplotlib
# Do not display figures
# matplotlib.use('agg')
# print(matplotlib.get_backend())
import matplotlib.pyplot as plt
import matplotlib.patheffects as path_effects
import chardet
import cv2
import jieba
import numpy as np
from PIL import ImageFont, ImageDraw, Image, ImageEnhance, ImageFilter
from captcha.image import ImageCaptcha
from keras_preprocessing.sequence import pad_sequences
from matplotlib import _pylab_helpers
from matplotlib.colors import rgb_to_hsv, hsv_to_rgb
# Make the project root importable before importing click_captcha
sys.path.append(os.path.dirname(os.path.abspath(__file__)) + "/../")
sys.path.append(os.path.dirname(os.path.abspath(__file__)))
from click_captcha.model import u_net_denoise
from click_captcha.utils import np2pil, pil2np, pil_resize, pil_rotate, pil2np_a, np2pil_a, pil_resize_a, pil_rotate_a

fig = plt.figure(figsize=(1, 1), facecolor='none')


def gen_siamese(paths, batch_size=32, shape=(40, 40), cls_num=1):
    num = len(paths)
    data_path = os.path.dirname(os.path.abspath(__file__)) + "/../data/click/"
    i = 0
    while True:
        if i >= num:
            i = 0
            random.shuffle(paths)
        height, width = shape[:2]
        X1 = np.zeros((batch_size, height, width, 1))
        X2 = np.zeros((batch_size, height, width, 1))
        Y = np.zeros((batch_size, 2))
        for j in range(batch_size):
            # Wrap around so that every sample is visited
            if i >= num:
                i = 0
                random.shuffle(paths)
            # Build the labelled pair
            img1, img2, label = paths[i][:-1].split("\t")
            # print(img1, img2, label)
            img1 = cv2.imread(data_path + img1)
            img1 = pil_resize(img1, shape[0], shape[1])
            img1 = cv2.cvtColor(img1, cv2.COLOR_BGR2GRAY)
            img1 = np.expand_dims(img1, axis=-1)
            img2 = cv2.imread(data_path + img2)
            img2 = pil_resize(img2, shape[0], shape[1])
            img2 = cv2.cvtColor(img2, cv2.COLOR_BGR2GRAY)
            img2 = np.expand_dims(img2, axis=-1)
            if label == "1":
                label = np.array([0, 1])
            else:
                label = np.array([1, 0])
            X1[j] = img1
            X2[j] = img2
            Y[j] = label
            # The original never advanced i, so every sample was paths[0]
            i += 1
        yield {"input_1": X1, "input_2": X2}, {"output": Y}


def gen_char(paths, batch_size=32, shape=(40, 40), cls_num=6270, data_path="click"):
    num = len(paths)
    data_path = os.path.dirname(os.path.abspath(__file__)) + "/../data/" + data_path + "/"
    i = 0
    random.shuffle(paths)
    while True:
        if i >= num:
            i = 0
            random.shuffle(paths)
        height, width = shape[:2]
        if len(shape) > 2 and shape[2] == 3:
            channel = 3
        else:
            channel = 1
        X = np.zeros((batch_size, height, width, channel))
        Y = np.zeros((batch_size, cls_num))
        j = 0
        random.shuffle(paths)
        while j < batch_size:
            if i >= num:
                random.shuffle(paths)
                i = 0
            path = paths[i].split(os.sep)[-1]
            # The file name starts with the class index: "<index>_..."
            char_index = int(path.split("_")[0])
            label = np.zeros(cls_num)
            # print("char_index", char_index)
            label[char_index] = 1
            # print("label", np.argmax(label), char_index)
            img1 = cv2.imread(data_path + path)
            img1 = pil_resize(img1, shape[0], shape[1])
            # img2 = copy.deepcopy(img1)
            # Data augmentation
            if random.choice([0, 1]):
                img1_pil = np2pil(img1)
                # NOTE: 3 is not in the choice list, so the contrast branch
                # never runs and aug == 2 applies no enhancement.
                aug = random.choice([0, 1, 2, 4, 5])
                # aug = 6
                if aug == 0:
                    img1_pil = image_enhance_color(img1_pil)
                elif aug == 1:
                    img1_pil = image_enhance_brightness(img1_pil)
                elif aug == 3:
                    img1_pil = image_enhance_contrast(img1_pil)
                elif aug == 4:
                    img1_pil = image_enhance_sharpness(img1_pil)
                elif aug == 5:
                    img1_pil = image_enhance_blur(img1_pil)
                img1 = pil2np(img1_pil)
                if aug == 6:
                    img1 = image_enhance_distort(img1)
                # print(aug)
            # cv2.imshow("origin", img2)
            # cv2.imshow("gen_char", img1)
            # cv2.waitKey(0)
            img1 = cv2.cvtColor(img1, cv2.COLOR_BGR2GRAY)
            img1 = np.expand_dims(img1, axis=-1)
            img1 = img1 / 255.
            X[j] = img1
            Y[j] = label
            i += 1
            j += 1
        # print("error_num", error_num)
        yield X, Y


def gen_yolo_char(paths, batch_size, input_shape, anchors, num_classes, box_num=6):
    """data generator for fit_generator"""
    n = len(paths)
    data_path = os.path.dirname(os.path.abspath(__file__)) + "/../data/detect/"
    i = 0
    while True:
        image_data = []
        box_data = []
        batch_cnt = 0
        while batch_cnt < batch_size:
            try:
                if i == 0:
                    np.random.shuffle(paths)
                ss = paths[i][:-1].split(" ")
                image_path = ss[0]
                image = cv2.imread(data_path+image_path)
                origin_h, origin_w = image.shape[:2]
                image = pil_resize(image, input_shape[0], input_shape[1])
                # Data augmentation
                if random.choice([0, 0, 1]):
                    img1_pil = np2pil(image)
                    # NOTE: 3 is not in the choice list, so the contrast branch
                    # never runs and aug == 2 applies no enhancement.
                    aug = random.choice([0, 1, 2, 4, 5])
                    if aug == 0:
                        img1_pil = image_enhance_color(img1_pil)
                    elif aug == 1:
                        img1_pil = image_enhance_brightness(img1_pil)
                    elif aug == 3:
                        img1_pil = image_enhance_contrast(img1_pil)
                    elif aug == 4:
                        img1_pil = image_enhance_sharpness(img1_pil)
                    elif aug == 5:
                        img1_pil = image_enhance_blur(img1_pil)
                    image = pil2np(img1_pil)
                image_show = copy.deepcopy(image)
                image = image / 255.
                box = np.array([np.array(list(map(int, box.split(',')))) for box in ss[1:]])
                # Images have different box counts; pad by copying existing boxes
                # print("box.shape", box.shape)
                if box.shape[0] < box_num:
                    # box = np.concatenate([box]+[np.zeros(5)]*(box_num-box.shape[0]), axis=0)
                    if box_num-box.shape[0] >= box.shape[0]:
                        # Use a 2-D slice (box[:1, :]); concatenating the 1-D row
                        # box[0, :] with the 2-D array raises ValueError.
                        box = np.concatenate([box]+[box[:1, :]]*(box_num-box.shape[0]), axis=0)
                    else:
                        box = np.concatenate([box, box[:box_num-box.shape[0], :]], axis=0)
                # show
                # box_show = box.tolist()
                # for b in box_show:
                #     print("box", b)
                #     cv2.rectangle(image_show, (b[0], b[1]), (b[2], b[3]), (255, 0, 0), 2)
                # cv2.imshow("image_show", image_show)
                # cv2.waitKey(0)
                # print("box.shape", box.shape)
                image_data.append(image)
                box_data.append(box)
                i = (i+1) % n
                batch_cnt += 1
            except Exception:
                # Skip samples that cannot be read or parsed
                i = (i+1) % n
                continue
            # print
            # print(image.shape)
            # image_show = (image*255).astype(np.uint8)
            # print("annotation_lines[i]", annotation_lines[i])
            # for _b in box:
            #     print(_b)
            #     cv2.rectangle(image_show, (int(_b[0]), int(_b[1])), (int(_b[2]), int(_b[3])), (0, 255, 0), 1)
            # cv2.imshow("image", image_show)
            # cv2.waitKey(0)
        image_data = np.array(image_data)
        box_data = np.array(box_data)
        # print(image_data.shape, box_data.shape)
        y_true = preprocess_true_boxes(box_data, input_shape, anchors, num_classes)
        yield [image_data, *y_true], np.zeros(batch_size)


def gen_yolo_puzzle(paths, batch_size, input_shape, anchors, num_classes, box_num=1):
    """data generator for fit_generator"""
    n = len(paths)
    data_path = os.path.dirname(os.path.abspath(__file__)) + "/../data/detect2/"
    i = 0
    while True:
        image_data = []
        box_data = []
        batch_cnt = 0
        while batch_cnt < batch_size:
            try:
                if i == 0:
                    np.random.shuffle(paths)
                ss = paths[i][:-1].split(" ")
                image_path = ss[0]
                image = cv2.imread(data_path+image_path)
                image = pil_resize(image, input_shape[0], input_shape[1])
                image_show = copy.deepcopy(image)
                image = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
                # Invert the grayscale image
                image = 255. - image
                image = np.uint8(image)
                # cv2.imshow("image", image)
                # cv2.waitKey(0)
                image = np.expand_dims(image, -1)
                image = image / 255.
                box = np.array([np.array(list(map(int, box.split(',')))) for box in ss[1:]])
                # Images have different box counts; pad by copying existing boxes
                if box.shape[0] < box_num:
                    box = np.concatenate([box, box[:2, :]], axis=0)
                # show
                # box_show = box.tolist()
                # for b in box_show:
                #     print("box", b)
                #     cv2.rectangle(image_show, (b[0], b[1]), (b[2], b[3]), (0, 0, 255), 2)
                # cv2.imshow("image_show", image_show)
                # cv2.waitKey(0)
                image_data.append(image)
                box_data.append(box)
                i = (i+1) % n
                batch_cnt += 1
            except Exception:
                # Skip samples that cannot be read or parsed
                i = (i+1) % n
                continue
            # print
            # print(image.shape)
            # image_show = (image*255).astype(np.uint8)
            # for _b in box:
            #     print(_b)
            #     cv2.rectangle(image_show, (int(_b[0]), int(_b[1])), (int(_b[2]), int(_b[3])), (0, 255, 0), 1)
            # cv2.imshow("image", image_show)
            # cv2.waitKey(0)
        image_data = np.array(image_data)
        box_data = np.array(box_data)
        # print(image_data.shape, box_data.shape)
        y_true = preprocess_true_boxes(box_data, input_shape, anchors, num_classes)
        yield [image_data, *y_true], np.zeros(batch_size)


def gen_drag(paths, batch_size=32, shape=(128, 256), cls_num=2):
    num = len(paths)
    data_path = os.path.dirname(os.path.abspath(__file__)) + "/../data/drag/"
    map_path = data_path+"map.txt"
    with open(map_path, "r") as f:
        _list = f.readlines()
    # Map each image file name to the column index of its clip line
    map_dict = {}
    for s in _list:
        ss = s[:-1].split(" ")
        map_dict[ss[0]] = ss[1]
    i = 0
    random.shuffle(paths)
    while True:
        if i >= num:
            i = 0
            random.shuffle(paths)
        height, width = shape[:2]
        if len(shape) > 2:
            channel = 3
        else:
            channel = 1
        X = np.zeros((batch_size, height, width, channel))
        Y = np.zeros((batch_size, height, width, 1))
        for j in range(batch_size):
            if i >= num:
                random.shuffle(paths)
                i = 0
            path = paths[i].split(os.sep)[-1]
            w_index = int(map_dict.get(path))
            # label = np.zeros(cls_num)
            # print("char_index", char_index)
            # label[w_index] = 1
            # print("label", np.argmax(label), char_index)
            img1 = cv2.imread(data_path + path)
            img1 = pil_resize(img1, shape[0], shape[1])
            # cv2.imshow("image", img1)
            label = np.full((shape[0], shape[1], 1), 0, dtype='uint8')
            label[:, w_index, 0] = 1
            # label[:, w_index, 1] = 1
            # cv2.imshow("label", np.expand_dims(label[..., 0], -1))
            # cv2.waitKey(0)
            img1 = img1 / 255.
            # img1 = cv2.cvtColor(img1, cv2.COLOR_BGR2GRAY)
            # img1 = np.expand_dims(img1, axis=-1)
            i += 1
            X[j] = img1
            Y[j] = label
        yield X, Y


def gen_phrase(map_list, batch_size=32, shape=(5707, 3)):
    voc_dim, timesteps = shape[:2]
    data_list = []
    for line in map_list:
        data_list.append([eval(line[:-3]), int(line[-2:-1])])
    num = len(data_list)
    i = 0
    random.shuffle(data_list)
    while True:
        X = np.zeros((batch_size, timesteps))
        Y = np.zeros((batch_size, 1))
        for j in range(batch_size):
            if i >= num:
                random.shuffle(data_list)
                i = 0
            data = data_list[i]
            d_list = [x for x in data[0]]
            d_list = d_list + [voc_dim]*(timesteps-len(d_list))
            X[j] = np.array(d_list)
            Y[j] = data[1]
            i += 1
        yield X, Y


def gen_equation(paths, batch_size=32, shape=(40, 40), input_len=21, label_len=8, cls_num=6270, data_path='equation'):
    num = len(paths)
    data_path = os.path.dirname(os.path.abspath(__file__)) + "/../data/" + data_path + "/"
    map_path = os.path.dirname(os.path.abspath(__file__)) + "/../data/equation.txt"
    with open(map_path, "r") as f:
        map_list = f.readlines()
    map_str = "".join(map_list)
    map_str = re.sub("\n", "", map_str)
    char_map_dict = {
        "星": '*',
        "斜": "/",
        "问": "?",
        'x': '×',
        '?': '?'
    }
    i = 0
    random.shuffle(paths)
    while True:
        if i >= num:
            i = 0
            random.shuffle(paths)
        height, width = shape[:2]
        if len(shape) > 2 and shape[2] == 3:
            channel = 3
        else:
            channel = 1
        X = np.zeros((batch_size, height, width, channel))
        Y = np.zeros((batch_size, label_len))
        input_length = np.ones(batch_size) * input_len
        label_length = np.ones(batch_size) * label_len
        j = 0
        while j < batch_size:
            if i >= num:
                random.shuffle(paths)
                i = 0
            path = paths[i].split(os.sep)[-1]
            char_index_list = []
            for c in path.split(".")[0].split('_')[1:]:
                if c in char_map_dict.keys():
                    c = char_map_dict.get(c)
                if not c:
                    continue
                # print("c", c)
                char_index_list.append(map_str.index(c)+1)
            char_index_list.extend([0] * (label_len-len(char_index_list)))
            label = np.array(char_index_list)
            img1 = cv2.imread(data_path + path)
            img1 = pil_resize(img1, shape[0], shape[1])
            # img2 = copy.deepcopy(img1)
            # cv2.imshow("origin", img2)
            # cv2.imshow("gen_char", img1)
            # cv2.waitKey(0)
            img1 = cv2.cvtColor(img1, cv2.COLOR_BGR2GRAY)
            img1 = np.expand_dims(img1, axis=-1)
            img1 = img1 / 255.
            X[j] = img1
            Y[j] = label
            i += 1
            j += 1
        yield [X, Y, input_length, label_length], np.ones(batch_size)


def gen_equation2(paths, batch_size=32, shape=(40, 40), input_len=21, label_len=8, cls_num=6270, data_path='equation'):
    os.environ["CUDA_VISIBLE_DEVICES"] = "-1"
    image_shape = (32, 192, 1)
    weights_path = "./models/e153-loss53.97-denoise.h5"
    model = u_net_denoise(input_shape=image_shape, class_num=image_shape[2])
    model.load_weights(weights_path)
    os.environ["CUDA_VISIBLE_DEVICES"] = "0"
    data_path = os.path.dirname(os.path.abspath(__file__)) + "/../data/" + data_path + "/"
    map_path = os.path.dirname(os.path.abspath(__file__)) + "/../data/equation.txt"
    with open(map_path, "r") as f:
        map_list = f.readlines()
    map_str = "".join(map_list)
    map_str = re.sub("\n", "", map_str)
    blank_index = len(map_str)+1
    char_map_dict = {
        "星": '*',
        "斜": "/",
        "问": "?",
        'x': '×',
        '?': '?'
    }
    i = 0
    while True:
        height, width = shape[:2]
        if len(shape) > 2 and shape[2] == 3:
            channel = 3
        else:
            channel = 1
        X = np.zeros((batch_size, height, width, channel))
        Y = np.zeros((batch_size, label_len))
        input_length = np.ones(batch_size) * input_len
        label_length = np.ones(batch_size) * label_len
        noise = random.choice([False, True, True])
        result_list = generate_data_equation2(batch_size, noise=noise)
        # Denoise the generated images with the model
        img_list = [np.expand_dims(cv2.cvtColor(pil_resize(x[0], image_shape[0], image_shape[1]), cv2.COLOR_BGR2GRAY), axis=-1) for x in result_list]
        img_data = np.array(img_list) / 255.
        pred = model.predict(img_data)
        img_list = np.uint8(pred*255.)
        # for j in range(len(result_list)):
        #     img1 = img_list[j]
        #     img1 = add_contrast(img1)
        #     ratio = get_image_legal(img1)
        #     if ratio >= 0.02:
        for j in range(len(result_list)):
            if random.choice([0, 1, 1, 1]):
                # Use the denoised image
                img1 = img_list[j]
                img1 = add_contrast(img1)
            else:
                # Use the raw generated image
                img1 = result_list[j][0]
                img1 = pil_resize(img1, shape[0], shape[1])
                img1 = cv2.cvtColor(img1, cv2.COLOR_BGR2GRAY)
                img1 = np.expand_dims(img1, axis=-1)
            # img1 = cv2.adaptiveThreshold(img1, 255, cv2.ADAPTIVE_THRESH_GAUSSIAN_C, cv2.THRESH_BINARY,
            #                              11, 20)
            # img1 = np.expand_dims(img1, axis=-1)
            # cv2.imshow("origin", img1)
            # if noise:
            #     if random.choice([0, 1]):
            #         img1 = 255 - img1
            # _, img1 = cv2.threshold(img1, 110, 255, cv2.THRESH_BINARY)
            char_list = result_list[j][1]
            if char_list[0] is None:
                # char_index_list = [0] * len(char_list)
                # print("len(map_str)+1", len(map_str)+1)
                char_index_list = [blank_index] * len(char_list)
            else:
                char_index_list = []
                for c in char_list:
                    if c in char_map_dict.keys():
                        c = char_map_dict.get(c)
                    if not c:
                        continue
                    # print("c", c)
                    # char_index_list.append(map_str.index(c)+1)
                    char_index_list.append(map_str.index(c))
            # Pad the label to label_len with the blank index
            char_index_list.extend([blank_index] * (label_len-len(char_index_list)))
            label = np.array(char_index_list)
            ratio = get_image_legal(img1)
            # img1 = pil_resize(img, shape[0], shape[1])
            # img2 = copy.deepcopy(img1)
            # print(char_list)
            # print(char_index_list)
            # print("ratio", ratio)
            # cv2.imshow("gen_char", img1)
            # cv2.waitKey(0)
            # img1 = cv2.cvtColor(img1, cv2.COLOR_BGR2GRAY)
            # img1 = np.expand_dims(img1, axis=-1)
            img1 = img1 / 255.
            X[j] = img1
            Y[j] = label
            i += 1
        yield [X, Y, input_length, label_length], np.ones(batch_size)


def gen_equation_denoise(paths, batch_size=32, shape=(40, 40)):
    while True:
        height, width = shape[:2]
        if len(shape) > 2 and shape[2] == 3:
            channel = 3
        else:
            channel = 1
        X = np.zeros((batch_size, height, width, channel))
        Y = np.zeros((batch_size, height, width, channel))
        result_list = generate_data_denoise(batch_size)
        j = 0
        for img, noise in result_list:
            img1 = pil_resize(img, shape[0], shape[1])
            img2 = pil_resize(noise, shape[0], shape[1])
            # img1 = add_contrast(img1)
            # img2 = add_contrast(img2)
            # cv2.imshow("origin", img2)
            img1 = cv2.cvtColor(img1, cv2.COLOR_BGR2GRAY)
            img1 = np.expand_dims(img1, axis=-1)
            img2 = cv2.cvtColor(img2, cv2.COLOR_BGR2GRAY)
            img2 = np.expand_dims(img2, axis=-1)
            # cv2.imshow("bg", img1)
            # cv2.imshow("noise", img2)
            # cv2.waitKey(0)
            img1 = img1 / 255.
            img2 = img2 / 255.
            # The noisy image is the input, the clean image is the target
            X[j] = img2
            Y[j] = img1
            j += 1
        yield X, Y


def generate_data_siamese(char_num=6, char_shape=(40, 40)):
    bg_paths = glob("../data/base/*")
    char_path = "../data/chinese_2500.txt"
    with open(char_path, "r") as f:
        char_str = f.read()
    data_dir = "../data/click/"
    for i in range(1000):
        bg = cv2.imread(random.sample(bg_paths, 1)[0])
        char_list = [char_str[x] for x in random.sample(range(len(char_str)), char_num)]
        image_np, position_list, tips_image_list = char_on_image(bg, char_list, char_shape)
        # for j in range(len(char_list)):
        #     char = char_list[j]
        #     p = position_list[j]
        #     print(char)
        #     cv2.rectangle(image_np, [p[1], p[0]], [p[1]+char_shape[1], p[0]+char_shape[0]], (255, 0, 0), 2)
        # cv2.imshow("generate_data", image_np)
        # cv2.waitKey(0)
        # Save the tips images
        tips_path_list = []
        for k in range(len(tips_image_list)):
            tips_image = tips_image_list[k]
            tips_path = str(i) + "_" + str(k) + ".jpg"
            tips_path_list.append(tips_path)
            cv2.imwrite(data_dir+tips_path, tips_image)
        # Save the character-region images
        char_path_list = []
        for j in range(len(char_list)):
            p = position_list[j]
            char_path = str(i) + "_" + str(len(tips_image_list)+j) + ".jpg"
            char_path_list.append(char_path)
            cv2.imwrite(data_dir+char_path, image_np[p[0]:p[0]+char_shape[0], p[1]:p[1]+char_shape[1], :])
        # Write the mapping data: "<tips>\t<char>\t<label>", label 1 for a matching pair
        with open("../data/click/map.txt", "a") as f:
            for j in range(len(tips_path_list)):
                tips_path = tips_path_list[j]
                for k in range(len(char_path_list)):
                    char_path = char_path_list[k]
                    if j == k:
                        f.write(tips_path + "\t" + char_path + "\t" + str(1) + "\n")
                    else:
                        f.write(tips_path + "\t" + char_path + "\t" + str(0) + "\n")


def generate_data_char(char_num=6, char_shape=(40, 40), image_shape=(160, 260)):
    # (40,40) (160, 260)
    # (80,80) (360, 590)
    bg_paths = glob("../data/base/*") + glob("../data/base1/*")*100
    char_path = "../data/chinese_simple_5649.txt"
    with open(char_path, "r") as f:
        char_list = f.readlines()
    # random.shuffle(char_list)
    char_str = "".join(char_list)
    char_str = re.sub("\n", "", char_str)
    data_dir = "../data/click_simple/"
    # Generate several images per character
    start_time = time.time()
    # 0- 1520 4000- 5760
    for i in range(0, 6000):
        if i % 20 == 0:
            print("Loop", i, time.time()-start_time)
            start_time = time.time()
        char = char_str[i]
        # Number of images with a background to generate
        image_cnt = 1
        char_list = [char] * image_cnt
        tips_cnt = 0
        image_cnt = 265
        tips_list = []
        for l in range(10):
            try:
                # Background image
                bg = cv2.imread(random.sample(bg_paths, 1)[0])
                if random.choice([0, 0, 1]):
                    bg = image_enhance_distort(bg)
                if random.choice([0, 0, 1]):
                    bg = image_enhance_flip(bg)
                # Generate 4 tips images and 6 rotated images with a background
                image_np, p_list, t_list = char_on_image(bg, char_list, char_shape, image_shape, char_stretch=False)
                # cv2.imshow("char_on_image", image_np)
                # cv2.waitKey(0)
                tips_list += t_list
                for p in p_list:
                    char_path = str(i) + "_" + str(image_cnt) + "_2" + ".jpg"
                    # Enlarge or shrink the box
                    if random.choice([1, 1, 1, 0]):
                        threshold = random.randint(12, 25)
                        y1 = max(0, p[0]-random.choice([0, threshold]))
                        y2 = min(image_np.shape[0], p[0]+char_shape[0]+random.choice([0, threshold]))
                        x1 = max(0, p[1]-random.choice([0, threshold]))
                        x2 = min(image_np.shape[1], p[1]+char_shape[1]+random.choice([0, threshold]))
                        # print(threshold)
                    else:
                        threshold = random.randint(4, 8)
                        y1 = p[0]+random.choice([0, threshold])
                        y2 = p[0]+char_shape[0]-random.choice([0, threshold])
                        x1 = p[1]+random.choice([0, threshold])
                        x2 = p[1]+char_shape[1]-random.choice([0, threshold])
                    # Shift the box
                    if random.choice([0, 0, 1]):
                        threshold = random.randint(8, 13)
                        h_flag, w_flag = 0, 0
                        if random.choice([0, 1]):
                            threshold = -threshold
                        if random.choice([0, 1, 1]):
                            h_flag = 1
                        if random.choice([0, 1, 1]):
                            w_flag = 1
                        y1 = p[0]+threshold*h_flag
                        y2 = p[0]+char_shape[0]+threshold*h_flag
                        x1 = p[1]+threshold*w_flag
                        x2 = p[1]+char_shape[1]+threshold*w_flag
                    sub_image_np = image_np[y1:y2, x1:x2, :]
                    # Skip empty crops
                    if sub_image_np.shape[0] == 0 or sub_image_np.shape[1] == 0:
                        continue
                    sub_image_np = pil_resize(sub_image_np, char_shape[0], char_shape[1])
                    # cv2.imshow("sub_image_np", sub_image_np)
                    # cv2.waitKey(0)
                    cv2.imwrite(data_dir+char_path, sub_image_np)
                    image_cnt += 1
            except Exception:
                print(1)
                continue
        for tips_image in tips_list[:1]:
            tips_path = str(i) + "_" + str(tips_cnt) + "_1" + ".jpg"
            # cv2.imshow("tips_image", tips_image)
            # cv2.waitKey(0)
            cv2.imwrite(data_dir+tips_path, tips_image)
            tips_cnt += 1


def generate_data_char_from_yolo():
    _dir = "D:/Project/captcha/data/detect/"
    map_path = _dir + "map.txt"
    with open(map_path, "r") as f:
        map_list = f.readlines()
    char_image_list = []
    for line in map_list[:10]:
        image_path = _dir + line.split(" ")[0]
        boxes = line[:-1].split(" ")[1:]
        boxes = [list(map(int, box.split(','))) for box in boxes]
        print(image_path, boxes)
        image = cv2.imread(image_path)
        for box in boxes:
            # Each box is x1,y1,x2,y2,class
            char_image = image[box[1]:box[3], box[0]:box[2], :]
            char_image_list.append(char_image)
    return


def generate_data_yolo_char(image_shape=(160, 256)):
    bg_paths = glob("../data/base1/*")
    char_path = "../data/chinese_6270.txt"
    with open(char_path, "r") as f:
        char_map_list = f.readlines()
    random.shuffle(char_map_list)
    data_dir = "../data/detect/"
    # with open(data_dir+"map.txt", "w") as f:
    #     f.write("")
    j = 0
    start_time = time.time()
    for i in range(46121, 46500):
        if i % 50 == 0:
            print("Loop", i, time.time()-start_time)
            start_time = time.time()
        if j >= len(char_map_list)-1:
            j = 0
            random.shuffle(char_map_list)
        # Random background size
        r = random.randint(160, 256)
        # random_image_shape = (r, random.choice([r*3, int(r*2.5), int(r*3.5), r*4]))
        random_image_shape = (r, r*3)
        # Random character shape and count
        r = random.randint(30, 40)
        char_shape = (r, r)
        char_num = random.randint(4, 6)
        tips_char_num = min(random.randint(3, char_num), 4)
        # Background
        try:
            bg = cv2.imread(random.sample(bg_paths, 1)[0])
            # bg = pil_resize(bg, random_image_shape[0], random_image_shape[1])
            if random.choice([0, 0, 0]):
                bg = image_enhance_distort(bg)
            if random.choice([0, 0, 0]):
                bg = image_enhance_flip(bg)
        except Exception:
            print(1)
            continue
        if bg is None:
            print(2)
            continue
        # Pick random characters
        char_list = [x[:-1] for x in char_map_list[j:j+char_num]]
        # print("char_list", char_list)
        j = j+char_num
        # Generate the image
        image_np, position_list, tips_image_list = char_on_image(bg, char_list, char_shape, random_image_shape, tips_char_num)
        if i < 5000:
            tips_image_np, tips_position_list = get_tips_image(tips_image_list, char_shape, random_image_shape)
        image_np_path = str(i) + ".jpg"
        # print(image_np.shape)
        image_np = pil_resize(image_np, image_shape[0], image_shape[1])
        cv2.imwrite(data_dir+image_np_path, image_np)
        if i < 5000:
            tips_image_np_path = str(i) + "_0.jpg"
            tips_image_np = pil_resize(tips_image_np, image_shape[0], image_shape[1])
            cv2.imwrite(data_dir+tips_image_np_path, tips_image_np)
        # Write the mapping data
        with open(data_dir+"map.txt", "a") as f:
            box_str = ""
            for p in position_list:
                # Rescale box coordinates from the random shape to image_shape
                x1 = int(p[1] * image_shape[1] / random_image_shape[1])
                y1 = int(p[0] * image_shape[0] / random_image_shape[0])
                x2 = int((p[1]+char_shape[1]) * image_shape[1] / random_image_shape[1])
                y2 = int((p[0]+char_shape[0]) * image_shape[0] / random_image_shape[0])
                threshold = random.randint(2, 4)
                if random.choice([0, 1]):
                    y1 = max(0, y1 - threshold)
                    y2 = min(image_np.shape[0], y2 + threshold)
                    x1 = max(0, x1 - threshold)
                    x2 = min(image_np.shape[1], x2 + threshold)
                    # print(threshold)
                box_str += str(x1) + "," + str(y1) + "," + \
                    str(x2) + "," + str(y2) + "," + \
                    str(0) + " "
                # cv2.rectangle(image_np, (x1, y1), (x2, y2), (255, 0, 0), 2)
            # cv2.imshow("image_np", image_np)
            # cv2.waitKey(0)
            box_str = box_str[:-1]
            f.write(image_np_path + " " + box_str + "\n")
            if i < 5000:
                box_str = ""
                # NOTE: these coordinates come from the last p of position_list,
                # not from tips_position_list.
                x1 = int(p[1] * image_shape[1] / random_image_shape[1])
                y1 = int(p[0] * image_shape[0] / random_image_shape[0])
                x2 = int((p[1]+char_shape[1]) * image_shape[1] / random_image_shape[1])
                y2 = int((p[0]+char_shape[0]) * image_shape[0] / random_image_shape[0])
                for p in tips_position_list:
                    box_str += str(x1) + "," + str(y1) + "," + \
                        str(x2) + "," + str(y2) + "," + \
                        str(0) + " "
                    cv2.rectangle(tips_image_np, (x1, y1), (x2, y2), (255, 0, 0), 2)
                # cv2.imshow("tips_image_np", tips_image_np)
                box_str = box_str[:-1]
                f.write(tips_image_np_path + " " + box_str + "\n")


def generate_data_yolo_char_1(image_shape=(160, 256)):
    bg_paths = glob("../data/base/*")
    char_path = "../data/chinese_6270.txt"
    with open(char_path, "r") as f:
        char_map_list = f.readlines()
    data_dir = "../data/detect/"
    # with open(data_dir+"map.txt", "w") as f:
    #     f.write("")
    j = 0
    for i in range(20000, 25000):
        if i % 1000 == 0:
            print("Loop", i)
        if j >= len(char_map_list)-1:
            j = 0
            random.shuffle(char_map_list)
        # Random character shape and count
        r = random.randint(31, 51)
        char_shape = (r, r)
        char_num = random.randint(3, 6)
        tips_char_num = 0
        # Background
        bg = cv2.imread(random.sample(bg_paths, 1)[0])
        if random.choice([0, 0, 1]):
            bg = image_enhance_distort(bg)
        if random.choice([0, 0, 1]):
            bg = image_enhance_flip(bg)
        # Pick random characters
        char_list = [x[:-1] for x in char_map_list[j:j+char_num]]
        # print("char_list", char_list)
        j = j+char_num
        # Generate the image
        image_np, position_list, tips_image_list = char_on_image(bg, char_list, char_shape, image_shape, tips_char_num, only_color=(0, 0, 0))
        image_np_path = str(i) + ".jpg"
        cv2.imwrite(data_dir+image_np_path, image_np)
        # Write the mapping data
        with open(data_dir+"map.txt", "a") as f:
            box_str = ""
            for p in position_list:
                # x1,y1,x2,y2,class. The original wrote x2 = x1 and y2 = y1
                # (a zero-size box); the box extent is added here to match
                # the commented-out rectangle below.
                box_str += str(p[1]) + "," \
                    + str(p[0]) + "," \
                    + str(p[1]+char_shape[1]) + "," \
                    + str(p[0]+char_shape[0]) + "," \
                    + str(0) + " "
                # cv2.rectangle(image_np, (p[1], p[0]), (p[1]+char_shape[1], p[0]+char_shape[0]), (255, 0, 0), 2)
            # cv2.imshow("image_np", image_np)
            # cv2.waitKey(0)
            box_str = box_str[:-1]
            f.write(image_np_path + " " + box_str + "\n")


def generate_data_yolo_puzzle(image_shape=(160, 256)):
    bg_paths = glob("../data/base/*.jpeg")
    data_dir = "../data/detect2/"
    with open(data_dir+"map.txt", "w") as f:
        f.write("")
    for i in range(0, 10000):
        if i % 1000 == 0:
            print("Loop", i)
        bg = cv2.imread(random.sample(bg_paths, 1)[0])
        if random.choice([0, 0, 1]):
            bg = image_enhance_distort(bg)
        if random.choice([0, 0, 1]):
            bg = image_enhance_flip(bg)
        r = random.randint(35, 60)
        puzzle_shape = (r, r)
        image_np, position_list = puzzle_on_image(bg, puzzle_shape, image_shape)
        image_np_path = str(i) + ".jpg"
        cv2.imwrite(data_dir+image_np_path, image_np)
        # Write the mapping data
        with open(data_dir+"map.txt", "a") as f:
            box_str = ""
            for p in position_list:
                box_str += str(p[1]) + "," + str(p[0]) + "," + \
                    str(p[1]+puzzle_shape[1]) + "," + str(p[0]+puzzle_shape[0]) + \
                    "," + str(0) + " "
                # cv2.rectangle(image_np, (p[1], p[0]), (p[1]+puzzle_shape[1], p[0]+puzzle_shape[0]), (255, 0, 0), 2)
            # cv2.imshow("image_np", image_np)
            # cv2.waitKey(0)
            box_str = box_str[:-1]
            f.write(image_np_path + " " + box_str + "\n")


def generate_data_drag_image(image_shape=(160, 260)):
    bg_paths = glob("../data/base/*")
    data_dir = "../data/drag/"
    with open(data_dir+"map.txt", "w") as f:
        f.write("")
    for i in range(10000):
        if i % 1000 == 0:
            print("Loop", i)
        bg = cv2.imread(random.sample(bg_paths, 1)[0])
        bg = pil_resize(bg, image_shape[0], image_shape[1])
        if random.choice([0, 0, 1]):
            bg = image_enhance_distort(bg)
        if random.choice([0, 0, 1]):
            bg = image_enhance_flip(bg)
        image_np, clip_line = get_drag_image(bg)
        image_np_path = str(i) + ".jpg"
        cv2.imwrite(data_dir+image_np_path, image_np)
        # Write the mapping data
        with open(data_dir+"map.txt", "a") as f:
            f.write(image_np_path + " " + str(clip_line[0][0]) + "\n")


def generate_data_phrase():
    data_path = os.path.dirname(os.path.abspath(__file__)) + "/../data/phrase/"
    char_path = data_path+"char.txt"
    with open(char_path, "r") as f:
        char_list = f.readlines()
    # Map each character (including its trailing newline) to its line index
    char_dict = {}
    for i in range(len(char_list)):
        char_dict[char_list[i]] = i
    phrase_list = []
    phrase_path = data_path+"phrase3.txt"
    with open(phrase_path, "r") as f:
        phrase_list += f.readlines()
    phrase_path = data_path+"phrase4.txt"
    with open(phrase_path, "r") as f:
        phrase_list += f.readlines()
    phrase_path = data_path+"phrase5.txt"
    with open(phrase_path, "r") as f:
        phrase_list += f.readlines()
    phrase_set = set(phrase_list)
    map_path = data_path+"map3.txt"
    with open(map_path, "w") as f:
        f.write("")
    data_list = []
    start_time = time.time()
    i = 0
    negative_way_flag = False
    for phrase in phrase_list:
        if i % 500000 == 0:
            # Flush the buffered samples to disk
            with open(map_path, "a") as f:
                f.writelines(data_list)
            data_list = []
            print("Loop", i, len(phrase_list), time.time()-start_time)
            start_time = time.time()
        i += 1
        # Positive sample
        index_list = []
        for char in phrase[:-1]:
            index_list.append(char_dict.get(char+"\n"))
        data_list.append(str(index_list) + " 1\n")
        # Negative samples
        if negative_way_flag:
            # Swap two random positions
            index1 = random.randint(0, len(index_list)-1)
            find_flag = False
            while not find_flag:
                index2 = random.randint(0, len(index_list)-1)
                if index1 != index2:
                    find_flag = True
            temp = index_list[index1]
            index_list[index1] = index_list[index2]
            index_list[index2] = temp
            if "".join([char_list[x][:-1] for x in index_list]) + "\n" not in phrase_set:
                data_list.append(str(index_list) + " 0\n")
        else:
            # Take up to two reorderings that are not themselves known phrases
            products = list(product(index_list, repeat=len(index_list)))
            random.shuffle(products)
            negative_cnt = 0
            for p in products:
                if negative_cnt >= 2:
                    break
                p = list(p)
                if len(set(p)) != len(p):
                    continue
                if p != index_list and "".join([char_list[x][:-1] for x in p]) + "\n" not in phrase_set:
                    data_list.append(str(p) + " 0\n")
                    negative_cnt += 1
    with open(map_path, "a") as f:
        f.writelines(data_list)


def generate_data_phrase_raw(word_len=5):
    paths = glob("D:/Chinese_corpus/answer/*/*.txt")
    phrase_path = "../data/phrase/phrase" + str(word_len) + "_new.txt"
    triple_list = []
    reg = "[^\u4e00-\u9fa5]"
    start_time = time.time()
    for i in range(len(paths)):
        if i % 1000 == 0:
            # NOTE: each flush reopens the file in "w" mode, so it overwrites
            # what earlier flushes wrote.
            with open(phrase_path, "w") as f:
                f.writelines(triple_list)
            print("Loop", i, len(paths), time.time()-start_time)
            start_time = time.time()
            triple_list = []
        with open(paths[i], "rb") as f:
            _b = f.read()
        try:
            text = _b.decode("gbk")
        except UnicodeDecodeError:
            try:
                text = _b.decode("gb2312")
            except UnicodeDecodeError:
                try:
                    text = _b.decode("gb18030")
                except UnicodeDecodeError:
                    print(chardet.detect(_b), "is None")
                    # Skip files that cannot be decoded; otherwise text would
                    # be undefined or left over from the previous file
                    continue
        filter_word = ["的"]
        for word in filter_word:
            text = re.sub(word, "#"*len(word), text)
        word_list = jieba.lcut(text, cut_all=False, HMM=True)
        for j in range(1, len(word_list)):
            current = word_list[j]
            current_re = re.search(reg, current)
            last = word_list[j-1]
            last_re = re.search(reg, last)
            if current_re:
                continue
            if len(current) == word_len:
                triple_list.append(current + "\n")
            elif len(current) + len(last) == word_len and not last_re:
                triple_list.append(last+current + "\n")
    triple_list = list(set(triple_list))
    print("len(triple_list)", len(triple_list))
    with open(phrase_path, "w") as f:
        f.writelines(triple_list)


def generate_data_equation():
    char_dict = {
        1: ['1', '一'],
        2: ['2', '二'],
        3: ['3', '三'],
        4: ['4', '四'],
        5: ['5', '五'],
        6: ['6', '六'],
        7: ['7', '七'],
        8: ['8', '八'],
        9: ['9', '九'],
        0: ['0', '零'],
        '+': ['+', '加', '加上'],
        '-': ['-', '减', '减去'],
        '*': ["*", "×", 'x', '乘', '乘以'],
        '/': ["/", '除', '÷'],
        '?': ['?', '?'],
        '去': ['去'],
        '上': ['上'],
        '以': ['以'],
    }
    only_op_dict = {
        '+': ['+', '加'],
        '-': ['-', '减'],
        '*': ["*", "×", 'x', '乘'],
        '/': ["/", '除', '÷'],
    }
    # Characters that are replaced in file names
    file_name_dict = {
        '*': "星",
        "/": "斜",
        "?": "问",
    }
    data_dir = "../data/equation/"
    # Generate several images per character
    start_time = time.time()
    # 0- 1520 4000- 5760
    for i in range(0, 100):
        if i % 20 == 0:
            print("Loop", i, time.time()-start_time)
            start_time = time.time()
        if random.choice([1, 1]):
            # Generate a random equation
            char_list = []
            result = -1
            op = random.choice(['+', '-', '*', '/'])
            # Resample until a subtraction has a non-negative result
            while result < 0:
                n1 = random.choice([random.randint(0, 9), random.randint(10, 99)])
                n2 = random.choice([random.randint(0, 9), random.randint(10, 99)])
                if op == '-':
                    result = n1 - n2
                else:
                    result = 1
            if len(str(n1)) > 1:
                n1 = [x for x in str(n1)]
            else:
                n1 = [random.choice(char_dict[n1])]
            if len(str(n2)) > 1:
                n2 = [x for x in str(n2)]
            else:
                n2 = [random.choice(char_dict[n2])]
            op = random.choice(char_dict[op])
            if len(op) > 1:
                op = [x for x in op]
            else:
                op = [op]
            char_list.extend(n1)
            char_list.extend(op)
            char_list.extend(n2)
            if random.choice([0, 0, 0, 1]):
                char_list.append('=')
                char_list.append(random.choice([''] + char_dict['?']))
        else:
            # Generate a random non-equation
            char_list = random.sample(list(char_dict.keys())+list(only_op_dict.keys())*2, random.randint(3, 6))
            char_list = [random.choice(char_dict[x]) for x in char_list]
        if random.choice([0, 0]):
            # Generate the background image
            big_flag = 0
            if random.choice([0, 1]):
                h = random.randint(50, 60)
                w = random.randint(200, 280)
            else:
                h = random.randint(120, 150)
                w = random.randint(400, 680)
                big_flag = 1
            bg = np.full((h, w, 3), random.randint(180, 255), dtype=np.uint8)
            current_w = random.randint(0, int(w/3))
            for char in char_list:
                # Get the single-character image
                if big_flag:
                    r = random.randint(80, 120)
                else:
                    r = random.randint(35, 45)
                char_image_pil = get_char_image2(char, (r, r), rotate=False)
                # Paste the character onto the background
                position_h = random.randint(0, bg.shape[0]-r)
                try:
                    position_w = random.randint(current_w-15, min(current_w+random.randint(0, 5), bg.shape[1]-r))
                except Exception:
                    position_w = current_w
                current_w = position_w+r
                bg = get_image_paste(bg, char_image_pil, position_h, position_w)
            # Add noise
            bg = create_noise(bg)
        else:
            fig_size = random.choice([(70, 25), (100, 26), (100, 40)])
            line = (1, 4)
            wavy = (1, 3)
            font_size = random.choice([(18, 25), (35, 40)])
            # Custom settings (override the random choices above)
            fig_size = (90, 34)
            line = (1, 4)
            wavy = (0, 0)
            font_size = (38, 50)
            offset_w = (-10, -2)
            bg = generate_data_equation_lsm("".join(char_list),
                                            fig_size=fig_size,
                                            font_color=random.choice([(20, 230, 20, 230, 20, 230),
                                                                      (70, 100),
                                                                      (10, 230, 10, 230, 10, 230),
                                                                      ]),
                                            font_size=font_size,
                                            rotate=random.choice([0, ]),
                                            shortline=(1, 5),
                                            line=line,
                                            line_width=(1, 3),
                                            wavy=wavy,
                                            offset_w=offset_w,
                                            offset_h=5,
                                            same_color=0,
                                            )
            bg = pil2np(bg)
            bg = create_noise_short_lines(bg, random.randint(3, 8))
        # Data augmentation
        if random.choice([0, 1]):
            bg = np2pil(bg)
            aug = random.choice([0, 1, 2, 3, 5])
            if aug == 0:
                bg = image_enhance_color(bg)
            elif aug == 1:
                bg = image_enhance_brightness(bg)
            elif aug == 2:
                bg = image_enhance_contrast(bg)
            elif aug == 3:
                bg = image_enhance_sharpness(bg)
            elif aug == 4:
                bg = image_enhance_blur(bg)
            bg = pil2np(bg)
            if aug == 5:
                bg = image_enhance_distort(bg)
            # print("aug", aug)
        # show
        cv2.imshow("bg", bg)
        cv2.waitKey(0)
        # write
        for j in range(len(char_list)):
            if char_list[j] in file_name_dict.keys():
                char_list[j] = file_name_dict.get(char_list[j])
        image_path = str(i) + "_" + "_".join(char_list) + ".jpg"
cv2.imwrite(data_dir+image_path, bg) def generate_data_equation2(batch_size, noise=True): char_dict = { 1: ['1', '一'], 2: ['2', '二'], 3: ['3', '三'], 4: ['4', '四'], 5: ['5', '五'], 6: ['6', '六'], 7: ['7', '七'], 8: ['8', '八'], 9: ['9', '九'], 0: ['0', '零'], '+': ['+', '加', '加上'], '-': ['-', '减', '减去'], '*': ["*", "×", 'x', '乘', '乘以'], '/': ['除', '÷'], '?': ['?', '?'], '去': ['去'], '上': ['上'], '以': ['以'], } only_op_dict = { '+': ['+', '加', '加上'], '-': ['-', '减', '减去'], '*': ["*", "×", 'x', '乘', '乘以'], '/': ['除', '÷'], } # 每个字生成多张图片 result_list = [] i = 0 while i < batch_size: # for i in range(0, batch_size): simple_flag = random.choice([0, 1]) # empty_flag = random.choice([0, 0, 0, 0, 0, 1, 0, 0, 0, 0]) empty_flag = 0 if random.choice([1, 1]): # 随机生成算式 char_list = [] # result = -1 op = random.choice(['+', '-', '*']) # while result < 0: if not simple_flag: n1 = random.choice([random.randint(0, 9), random.randint(10, 99)]) n2 = random.choice([random.randint(0, 9), random.randint(10, 99)]) # if op == '-': # result = n1 - n2 # else: # result = 1 if len(str(n1)) > 1: n1 = [x for x in str(n1)] else: n1 = [random.choice(char_dict[n1])] if len(str(n2)) > 1: n2 = [x for x in str(n2)] else: n2 = [random.choice(char_dict[n2])] else: n1 = random.randint(0, 9) n2 = random.randint(0, 9) n1 = [random.choice(char_dict[n1])] n2 = [random.choice(char_dict[n2])] op = random.choice(char_dict[op]) # if simple_flag: # op = op[0] if len(op) > 1: op = [x for x in op] else: op = [op] char_list.extend(n1) char_list.extend(op) char_list.extend(n2) if random.choice([0, 0, 0, 1]): char_list.append('=') if random.choice([0, 1]) and not simple_flag: char_list.append(random.choice(char_dict['?'])) else: # 随机生成非算式 char_list = random.sample(list(char_dict.keys())+list(only_op_dict.keys())*2, random.randint(3, 6)) char_list = [random.choice(char_dict[x]) for x in char_list] new_char_list = [] for c in char_list: if len(c) > 1: new_char_list.extend([x for x in c]) else: new_char_list.append(c) char_list = 
new_char_list char_list = char_list[:8] if random.choice([0, 0]): # 生成背景图 big_flag = 0 if random.choice([0, 1]): h = random.randint(50, 60) w = random.randint(200, 280) else: h = random.randint(120, 150) w = random.randint(400, 680) big_flag = 1 bg = np.full((h, w, 3), random.randint(180, 255), dtype=np.uint8) current_w = random.randint(-10, 1) for char in char_list: # 获取单字图片 if big_flag: r = random.randint(80, 120) else: r = random.randint(35, 45) char_image_pil = get_char_image2(char, (r, r), rotate=False) # 字体添加到背景图上 position_h = random.randint(0, bg.shape[0]-r) try: position_w = random.randint(current_w-15, min(current_w+random.randint(0, 5), bg.shape[1]-r)) except: position_w = current_w current_w = position_w+r bg = get_image_paste(bg, char_image_pil, position_h, position_w) # 增加噪声 bg = create_noise(bg) else: if random.choice([0, 1]): fig_size = random.choice([(70, 25), (100, 26), (100, 40)]) line = (0, 0) wavy = (0, 0) font_size = random.choice([(18, 25), (35, 40)]) # offset_w = (-20, 6) else: # 定制 fig_size = (90, 34) line = (0, 0) wavy = (0, 0) font_size = (38, 50) offset_w = (0, 1) if simple_flag: fig_size = (random.randint(120, 240), 32) font_size = (18, 28) if empty_flag: bg = np.full((fig_size[1], fig_size[0], 3), fill_value=random.randint(125, 255), dtype=np.uint8) char_list = [None] * random.randint(3, 8) else: bg = generate_data_equation_lsm("".join(char_list), fig_size=fig_size, font_color=random.choice([(0, 100)]), font_size=font_size, rotate=random.choice([0, ]), shortline=(0, 0), line=line, line_width=(1, 3), wavy=wavy, offset_w=offset_w, offset_h=5, same_color=0, simple_flag=simple_flag ) bg = pil2np(bg) if get_image_legal(bg) < 0.02: continue if noise: bg = create_noise_point(bg, random.randint(15, 50)) bg = create_noise_short_lines(bg, random.randint(1, 5)) # 数据增强 if random.choice([0, 1]): bg = np2pil(bg) aug = random.choice([0, 1, 2, 3, 5]) if aug == 0: bg = image_enhance_color(bg) elif aug == 1: bg = image_enhance_brightness(bg) elif aug == 
2: bg = image_enhance_contrast(bg) elif aug == 3: bg = image_enhance_sharpness(bg) elif aug == 4: bg = image_enhance_blur(bg) bg = pil2np(bg) if aug == 5: bg = image_enhance_distort(bg) # print("aug", aug) # show # cv2.imshow("bg", bg) # gray = cv2.cvtColor(bg, cv2.COLOR_BGR2GRAY) # gray = eight_neighbour(gray, 4) # cv2.imshow("gray", gray) # cv2.waitKey(0) result_list.append([bg, char_list]) i += 1 return result_list


def generate_data_equation_lsm(text, fig_size=(200, 70), fonts=glob("../font/*"), font_color=(10, 100),
                               same_color=1, font_size=(25, 35), rotate=0, font_noise=0,
                               offset_w=(0, 0), offset_h=0, line=(0, 0), shortline=(0, 0),
                               line_width=(0, 1), line_color=(200, 250), point=(0, 500),
                               point_color=(150, 250), frame_color=None, wavy=(0, 0),
                               bg=(200, 255), simple_flag=0):
    """
    text: captcha text
    fig_size: captcha image (width, height)
    fonts: list of font files; one is picked at random
    font_noise: scattered-dot noise on the glyphs; 0 disables it, 1 enables it
    offset_w: horizontal offset
    offset_h: vertical offset
    font_color: font color range
    rotate: font rotation angle
    line: range for the number of interference lines
    point: range for the number of interference dots
    wavy: range for the number of wavy lines
    line_color, point_color: colors of the interference lines and dots
    bg: background color range
    """
    def random_xy(width, height):
        """
        Return a random (x, y) position within the given bounds.
        width: image width, height: image height
        """
        x = random.randint(0, width)
        y = random.randint(0, height)
        return x, y

    def random_color(color_tuple):
        """
        Return a random color within the given range.
        color_tuple: (low, high) shared by all three channels, or
        (r_low, r_high, g_low, g_high, b_low, b_high)
        """
        if len(color_tuple)==2:
            rs, re = color_tuple
            gs = bs = rs
            ge = be = re
        else:
            rs, re, gs, ge, bs, be = color_tuple
        red = random.randint(rs, re)
        green = random.randint(gs, ge)
        blue = random.randint(bs, be)
        return (red, green, blue)

    def Asin(x, A=8, w=0.05, b=6, k=40):
        """
        Value of y = A*sin(ωx + φ) + k, where ω is `w` and φ is `b`.
        A: amplitude; for an object in straight-line reciprocating motion along a sine curve,
           it is half of the stroke.
        (ωx + φ): phase, which reflects the state of the variable y.
        φ: initial phase, i.e. the phase at x = 0; it shifts the graph left or right.
        k: offset; it shifts the graph up or down.
        ω: angular velocity, which controls the sine period (oscillations per unit radian).
        """
        return A*math.sin(w*x+b)+k

def get_wavy_line(w=(0, 100), h=(30, 50)): """Generate the coordinates of a wavy line.""" import random n = 50 x = 0 y = random.randint(h[0],h[1]) flag = random.randint(0,2) xy = [(x, y)] while x < w[1]: temp_y = random.randint(1, 3) temp_x = random.randint(5, 10) if flag == 0: if y + temp_y > h[1]: y -= temp_y flag = 1 else: y
+= temp_y else: if y - temp_y < h[0]: y += temp_y flag = 0 else: y -= temp_y x = x+temp_x if x+temp_x < w[1] else w[1] xy.append((x, y)) return xy def get_char_img(char, font, font_color, rotate, bg, font_noise=0): """ 生成单个字符图片,随机颜色加随机旋转 """ w, h = draw.textsize(char, font=font) im = Image.new('RGBA', (w, h), color=bg) ImageDraw.Draw(im).text((0,0), char, font=font, fill=font_color) if rotate and char not in ['+', '-', '×']: im = im.rotate(random.randint(-rotate, rotate), Image.BILINEAR, expand=1) im = im.crop(im.getbbox()) if font_noise: im_draw = ImageDraw.Draw(im) # for i in range(random.randint(1,20)): for i in range(random.randint(int(w*h*0.01),min(int(w*h*0.05), 5))): im_draw.point(xy=(random.randint(0, w), random.randint(0, h)),fill=bg) table = [] for i in range(256): table.append(i * 97) # 5.97 mask = im.convert('L').point(table) return (im, mask) bg = random_color(bg) img = Image.new(mode='RGB', size=fig_size, color=bg) draw = ImageDraw.Draw(im=img, mode='RGB') font_path = random.choice(fonts) # print("font_path", font_path) font_size1 = random.randint(font_size[0], font_size[1]) font = ImageFont.truetype(font_path, size=font_size1) # font=None, size=10, index=0, encoding="" rotate = random.randint(0, rotate) char_color = random_color(font_color) re_s = re.search('(\d+|\?)(\+|-|\*|×)(\d+|\?)(=)(-?\d+|\?)?', text) if re_s: # print(re_s.group(0)) char_imgs = [] char_list = [] if same_color: for i in range(1, 6): if re_s.group(i) is not None: char_list.append(re_s.group(i)) char_imgs.append(get_char_img(re_s.group(i), font, font_color=char_color, rotate=rotate, bg=bg, font_noise=font_noise)) else: for i in range(1, 6): if re_s.group(i) is not None: char_list.append(re_s.group(i)) char_imgs.append(get_char_img(re_s.group(i), font, font_color=random_color(font_color), rotate=rotate, bg=bg, font_noise=font_noise)) else: if same_color: char_imgs = [get_char_img(char, font, font_color=char_color, rotate=rotate, bg=bg, font_noise=font_noise) for char in text] else: 
char_imgs = [get_char_img(char, font, font_color=random_color(font_color), rotate=rotate, bg=bg, font_noise=font_noise) for char in text] ws = [img[0].size[0] for img in char_imgs] hs = [img[0].size[1] for img in char_imgs] w = max(sum(ws), fig_size[0]) h = max(max(hs), fig_size[1]) if w>fig_size[0] or h>fig_size[1]: img = Image.new('RGB', (w+6, h+6), color=bg) draw = ImageDraw.Draw(im=img, mode='RGB') # im, mode=None w, h = img.size fig_size = img.size # 短线 for i in range(random.randint(shortline[0], shortline[1])): x0, y0 = random_xy(w, h) x1 = x0 + random.randint(2, 5) y1 = y0 + random.randint(2, 5) draw.line(xy=((x0,y0),(x1,y1)), fill=random_color(line_color), width=random.randint(line_width[0], line_width[1])) # xy, fill=None, width=0 if rotate: temp_x = random.randint(0, min(570, int((fig_size[0]-sum(ws))/2+1))) # int((fig_size[0]-sum(ws))/5) # temp_y = random.randint(int((fig_size[1]-hs[0])/8), int((fig_size[1]-hs[0])/2+1)) temp_y = random.randint(0, int(fig_size[1]/4)) # print('len(char_imgs):',len(char_imgs)) for i in range(len(char_imgs)): # tmp_offset = random.randint(offset_w[0], offset_w[1]) if sum(ws)+(len(ws)-1)*offset_w[1] 0: temp_x = new_x temp_y = new_y # temp_x = new_x if new_x+ws[i] < fig_size[0] else temp_x+ws[i-1] # temp_y = new_y if 0 < new_y and new_y+hs[i] < fig_size[1] else random.randint(0, h-hs[i]+1) table = [] for _i in range(256): table.append(_i * 97) # 5.97 mask = img.crop((temp_x, temp_y, temp_x+char_imgs[i][0].size[0], temp_y+char_imgs[i][0].size[1])).convert('L').point(table) img.paste(char_imgs[i][0], box=(temp_x, temp_y), mask=mask) new_x = temp_x+ws[i]+tmp_offset new_y = temp_y+random.randint(-offset_h, offset_h) # 直线 for i in range(random.randint(line[0], line[1])): x0, y0 = random_xy(w, h) x1, y1 = random_xy(w, h) draw.line(xy=((x0, y0), (x1, y1)), fill=random_color(line_color), width=random.randint(line_width[0], line_width[1])) # 散点 # for i in range(random.randint(point[0], point[1])): # if font_color == point_color: # 
draw.point(xy=(random.randint(0, fig_size[0]), random.randint(0, fig_size[1])), # fill=char_color) # else: # draw.point(xy=(random.randint(0, fig_size[0]), random.randint(0, fig_size[1])), # fill=random_color(point_color)) if random.random() >= 0.5: A_ = random.uniform(hs[1]*0.1, hs[1]*0.2) w_ = math.pi*4/w # random.uniform(0.04, 0.06) b_ = random.random()*math.pi k_ = random.uniform(h*0.5, h*0.7) # 波浪线 for _ in range(random.randint(wavy[0],wavy[1])): draw.line(xy=[(x, Asin(x, A_, w_, b_, k_)) for x in range(int(w))], fill=char_color, width=random.randint(line_width[0], line_width[1])) else: # 波浪线 for _ in range(random.randint(wavy[0], wavy[1])): draw.line(xy=get_wavy_line(w=(0, w), h=(min(hs)-5, max(hs)+5)), fill=char_color, width=random.randint(line_width[0], line_width[1])) # 边框 if frame_color is not None: draw.line(xy=[(0, 0), (0, h), (0, 0), (w, 0), (w-1, 0), (w-1, h), (0, h-1), (w-1, h-1)], fill=random_color(frame_color)) if simple_flag: offset_w = (int(1/15*fig_size[0]), int(1/12*fig_size[0])) else: offset_w = (int(-1/5*font_size1), int(-1/10*font_size1)) if not rotate: # temp_x = random.randint(0, min(70, int((fig_size[0]-sum(ws))/2+1))) #int((fig_size[0]-sum(ws))/5) # temp_y = random.randint(int((fig_size[1]-hs[0])/8), int((fig_size[1]-hs[0])/2+1)) if simple_flag: temp_x = random.randint(0, int(fig_size[0]/8)) else: temp_x = random.randint(0, int(fig_size[0]/4)) temp_y = 0 for i in range(len(char_imgs)): # tmp_offset = random.randint(offset_w[0], offset_w[1]) if sum(ws)+(len(ws)-1)*offset_w[1] 0: temp_x = new_x temp_y = new_y # temp_x = new_x if new_x+ws[i] < fig_size[0] else temp_x+ws[i-1] # temp_y = new_y if 0 < new_y and new_y+hs[i] < fig_size[1] else temp_y if same_color: if re_s: draw.text((temp_x, temp_y), char_list[i], font=font, fill=char_color) else: draw.text((temp_x, temp_y), text[i], font=font, fill=char_color) else: if re_s: draw.text((temp_x, temp_y), char_list[i], font=font, fill=random_color(font_color)) else: draw.text((temp_x, temp_y), 
text[i], font=font, fill=random_color(font_color)) # new_x = temp_x+ws[i]+tmp_offset # print("new_x", new_x, temp_x, ws[i], tmp_offset) new_y = temp_y+random.randint(-offset_h, offset_h) # filename = font_name+"_"+str(uuid.uuid1())+"_"+text # img.save('/data/python/lsm/gen_captcha/{}.jpg'.format(filename)) # return img return img def generate_data_denoise(batch_size): char_dict = { 1: ['1', '一'], 2: ['2', '二'], 3: ['3', '三'], 4: ['4', '四'], 5: ['5', '五'], 6: ['6', '六'], 7: ['7', '七'], 8: ['8', '八'], 9: ['9', '九'], 0: ['0', '零'], '+': ['+', '加', '加上'], '-': ['-', '减', '减去'], '*': ["*", "×", 'x', '乘', '乘以'], '/': ["/", '除', '÷'], '?': ['?', '?'], '去': ['去'], '上': ['上'], '以': ['以'], } only_op_dict = { '+': ['+', '加', '加上'], '-': ['-', '减', '减去'], '*': ["*", "×", 'x', '乘', '乘以'], '/': ["/", '除', '÷'], } # 每个字生成多张图片 result_list = [] for i in range(0, batch_size): if random.choice([1, 1, 0]): # 随机生成算式 char_list = [] result = -1 op = random.choice(['+', '-', '*', '/']) while result < 0: n1 = random.choice([random.randint(0, 9), random.randint(10, 99)]) n2 = random.choice([random.randint(0, 9), random.randint(10, 99)]) if op == '-': result = n1 - n2 else: result = 1 if len(str(n1)) > 1: n1 = [x for x in str(n1)] else: n1 = [random.choice(char_dict[n1])] if len(str(n2)) > 1: n2 = [x for x in str(n2)] else: n2 = [random.choice(char_dict[n2])] op = random.choice(char_dict[op]) if len(op) > 1: op = [x for x in op] else: op = [op] char_list.extend(n1) char_list.extend(op) char_list.extend(n2) if random.choice([0, 0, 0, 1]): char_list.append('=') char_list.append(random.choice([''] + char_dict['?'])) else: # 随机生成非算式 char_list = random.sample(list(char_dict.keys())+list(only_op_dict.keys())*2, random.randint(3, 6)) char_list = [random.choice(char_dict[x]) for x in char_list] new_char_list = [] for c in char_list: if len(c) > 1: new_char_list.extend([x for x in c]) else: new_char_list.append(c) char_list = new_char_list char_list = char_list[:8] if random.choice([0, 1]): fig_size 
= random.choice([(70, 25), (100, 26), (100, 40)]) line = (0, 0) wavy = (0, 0) font_size = random.choice([(18, 25), (35, 40)]) # offset_w = (-20, 6) else: # 定制 fig_size = (90, 34) line = (0, 0) wavy = (0, 0) font_size = (38, 50) offset_w = (-8, -2) bg = generate_data_equation_lsm("".join(char_list), fig_size=fig_size, font_color=random.choice([(20, 230, 20, 230, 20, 230), (70, 100), (10, 230, 10, 230, 10, 230), ]), font_size=font_size, rotate=random.choice([0, ]), shortline=(0, 0), line=line, line_width=(0, 0), wavy=wavy, offset_w=offset_w, offset_h=5, same_color=0, ) bg = pil2np(bg) # print(1) noise = create_noise_point(bg, random.randint(15, 50)) # print(2) noise = create_noise_short_lines(noise, random.randint(1, 5)) # 数据增强 if random.choice([0, 1]): bg = np2pil(bg) noise = np2pil(noise) aug = random.choice([0, 1, 2, 3]) if aug == 0: r = random.uniform(.5, 6.) bg = image_enhance_color(bg, _range=(r, r)) noise = image_enhance_color(noise, _range=(r, r)) elif aug == 1: r = random.uniform(.3, 2.) bg = image_enhance_brightness(bg, (r, r)) noise = image_enhance_brightness(noise, (r, r)) elif aug == 2: r = random.uniform(.3, 6.) bg = image_enhance_contrast(bg, (r, r)) noise = image_enhance_contrast(noise, (r, r)) elif aug == 3: r = random.uniform(4., 8.) 
bg = image_enhance_sharpness(bg, (r, r)) noise = image_enhance_sharpness(noise, (r, r)) elif aug == 4: bg = image_enhance_blur(bg) noise = image_enhance_blur(noise) bg = pil2np(bg) noise = pil2np(noise) if aug == 5: bg = image_enhance_distort(bg) # print("aug", aug) # show # cv2.imshow("bg", bg) # cv2.imshow("noise", noise) # cv2.waitKey(0) result_list.append([bg, noise]) return result_list def char_on_image(image_np, char_list, char_shape, image_shape, tip_char_num=1, only_color=None, char_stretch=False): position_list = [] for char in char_list: # 获取单字图片 char_image_pil = get_char_image2(char, char_shape, char_color=only_color) image_np = pil_resize(image_np, image_shape[0], image_shape[1]) # h, w fg_w, fg_h = char_image_pil.size[:2] bg_h, bg_w = image_np.shape[:2] # 字体放置的位置,且位置不重叠 find_flag = 0 while not find_flag: position_h = random.randint(0, bg_h-fg_h) position_w = random.randint(0, bg_w-fg_w) if len(position_list) < 1: find_flag = 1 break for p in position_list: if get_iou(position_w, position_h, position_w+fg_w, position_h+fg_h, p[1], p[0], p[1]+fg_w, p[0]+fg_h) > 0: find_flag = 0 break else: find_flag = 1 position_list.append([position_h, position_w]) # 字体添加到背景图上 # image_np = get_image_roi(image_np, char_image_np, position_h, position_w) image_np = get_image_paste(image_np, char_image_pil, position_h, position_w, stretch=char_stretch) # 生成提示图片 image_list = [] for char in char_list[:tip_char_num]: char_image_pil = get_char_image2(char, char_shape, rotate=True, is_tips=True) char_image_np = pil2np_a(char_image_pil) # char_image_np = pil_resize(char_image_np, char_shape[0], char_shape[1]) image_list.append(char_image_np) if image_list: tips_image_np = np.concatenate(image_list, axis=1) # 加干扰 tips_image_np = create_noise(tips_image_np) # 切割 image_list = [] for i in range(tip_char_num): image_list.append(tips_image_np[:, i*char_shape[1]:(i+1)*char_shape[1], :]) return image_np, position_list, image_list def get_char_image(char, char_shape, rotate=True, 
bg_color=(0, 0, 0, 0)): # 创建空图 image_pil = Image.new('RGBA', (80, 80), bg_color) # 空图上写字 # font_size = 35 # (40, 40) font_size = 75 # (80, 80) font_type_list = glob("../font/*") font_type = random.sample(font_type_list, 1)[0] font_config = ImageFont.truetype(font_type, int(font_size)) dr = ImageDraw.Draw(image_pil) fill_color = random_color() fill_color = (fill_color[0], fill_color[1], fill_color[2]) dr.text((3, -6), char, font=font_config, fill=fill_color) if rotate: if random.choice([0, 1]): angle = random.randint(0, 80) else: angle = random.randint(280, 360) image_pil = image_pil.rotate(angle, expand=False, fillcolor=bg_color) # image_pil.show("1") image_pil = image_pil.resize(char_shape) # cv2.imshow("get_char_image", pil2np(image_pil)) # cv2.waitKey(0) return image_pil def get_char_image2(char, char_shape, rotate=True, char_color=None, is_tips=False): # plt.ion() # plt.ioff() font_files = ['simhei', 'simsun', "Microsoft YaHei"] # 修改matplotlib中font_manager.py中defaultFamily plt.rcParams["font.sans-serif"] = ["Microsoft YaHei"] font_colors = [ 'aqua', 'aquamarine', 'bisque', 'black', 'blue', 'blueviolet', 'brown', 'burlywood', 'cadetblue', 'chartreuse', 'chocolate', 'coral', 'cornflowerblue', 'cornsilk', 'crimson', 'cyan', 'darkblue', 'darkcyan', 'darkgoldenrod', 'darkgray', 'darkgreen', 'darkmagenta', 'darkolivegreen', 'darkorange', 'darkorchid', 'darkred', 'darksalmon', 'darkseagreen', 'darkslategray', 'darkturquoise', 'darkviolet', 'deeppink', 'deepskyblue', 'dimgray', 'dodgerblue', 'firebrick', 'forestgreen', 'fuchsia', 'gold', 'goldenrod', 'gray', 'green', 'greenyellow', 'hotpink', 'indianred', 'indigo', 'lawngreen', 'lightseagreen', 'lightskyblue', 'lightslategray', 'lightsteelblue', 'lime', 'limegreen', 'magenta', 'maroon', 'mediumaquamarine', 'mediumblue', 'mediumorchid', 'mediumpurple', 'mediumseagreen', 'mediumslateblue', 'mediumspringgreen', 'mediumturquoise', 'mediumvioletred', 'midnightblue', 'navy', 'olive', 'olivedrab', 'orange', 'orangered', 
'orchid', 'palegoldenrod', 'palegreen', 'paleturquoise', 'palevioletred', 'peachpuff', 'peru', 'pink', 'plum', 'purple', 'red', 'rosybrown', 'royalblue', 'saddlebrown', 'salmon', 'sandybrown', 'seagreen', 'sienna', 'silver', 'skyblue', 'slateblue', 'slategray', 'springgreen', 'steelblue', 'tan', 'teal', 'tomato', 'turquoise', 'violet', 'yellow', 'yellowgreen' ] line_width = random.choice([random.randint(1, 6), 0, 0]) # print("char_color", char_color) if char_color == (0, 0, 0): font_color = 'black' line_color = 'black' font_weight = random.randint(1, 300) else: if random.choice([1, 1, 1]) or is_tips: font_weight = random.randint(200, 500) font_color = random.sample(font_colors, 1)[0] line_color = random.sample(font_colors, 1)[0] else: font_color = 'black' line_color = 'black' font_weight = random.randint(40, 100) if rotate: rotation = random.choice([random.randint(0, 70), random.randint(290, 360), 0]) else: rotation = 0 # print("font_color", font_color, char) # 写字 plt.rcParams['font.weight'] = font_weight text = fig.text(0.5, 0.5, char, ha="center", va="center", size=55, color=font_color, rotation=rotation) # 字体有边框 text.set_path_effects([path_effects.Stroke(linewidth=line_width, foreground=line_color), path_effects.Normal()]) # 更新图 # plt plt.pause(0.0000001) # fig = plt.gcf() # plt.plot() # plt.show() # plt.legend() # 申请缓存 _buffer = io.BytesIO() fig.savefig(_buffer, format='png') _buffer.seek(0) image_pil = Image.open(_buffer) plt.cla() plt.clf() # plt.close('all') gc.collect() # 大小限制 image_pil = image_pil.crop((105, 0, image_pil.size[0]-105, image_pil.size[1]-12)) image_pil = image_pil.resize(char_shape) # image_pil.show() return image_pil def get_tips_image(tips_image_list, char_shape, image_shape): new_list = [] for img in tips_image_list: # if random.choice([0, 0, 1]): # angle = random.choice([random.randint(0, 70), random.randint(290, 360)]) # img = pil_rotate(img, angle, (255, 255, 255)) new_list.append(img) tips_image_np = np.concatenate(new_list, axis=1) 
new_image = np.full((image_shape[0], image_shape[1], 3), 0, np.uint8) new_image[:tips_image_np.shape[0], :tips_image_np.shape[1], :] = tips_image_np position_list = [] for i in range(len(new_list)): h = 0 w = i*char_shape[1] position_list.append([h, w]) if random.choice([0, 1]): new_image[np.all(new_image==0, axis=2)] = 255 return new_image, position_list


def get_image_roi(image_bg, image_fg, roi_h, roi_w):
    # h, w
    fg_h, fg_w = image_fg.shape[:2]
    bg_h, bg_w = image_bg.shape[:2]

    # ROI slice of the background
    roi = image_bg[roi_h:roi_h+fg_h, roi_w:roi_w+fg_w]

    # Mask of the non-glyph area of fg: drops the glyph pixels of fg
    # so that only the part of bg outside the glyph is kept
    img_fg_gray = cv2.cvtColor(image_fg, cv2.COLOR_BGR2GRAY)
    ret, mask = cv2.threshold(img_fg_gray, 0, 255, cv2.THRESH_OTSU)
    bg_roi = cv2.bitwise_and(roi, roi, mask=mask)

    # Mask of the glyph area of fg: drops the white background of fg
    # so that only the glyph is kept
    mask_inv = cv2.bitwise_not(mask)
    fg_roi = cv2.bitwise_and(image_fg, image_fg, mask=mask_inv)

    # Erode and dilate to remove white specks
    # kernel = cv2.getStructuringElement(cv2.MORPH_RECT, (3, 3))
    # fg_roi = cv2.erode(fg_roi, kernel)
    # fg_roi = cv2.dilate(fg_roi, kernel)

    # bg outside the glyph + glyph from fg
    image_roi = cv2.add(bg_roi, fg_roi)

    # Write the ROI back into the original bg
    image_bg[roi_h:roi_h+fg_h, roi_w:roi_w+fg_w, :] = image_roi

    # cv2.imshow("image_fg", image_fg)
    # cv2.imshow("get_image_roi", image_bg)
    # cv2.waitKey(0)
    return image_bg


def get_image_paste(image_bg, image_fg, roi_h, roi_w, stretch=False):
    # PIL's Image.size is (width, height), not (height, width)
    fg_w, fg_h = image_fg.size[:2]
    if stretch:
        if random.choice([1, 1, 1]):
            threshold = random.randint(7, 13)
            # print(threshold)
            if random.choice([0, 1]):
                # Shrink the height
                image_fg = image_fg.resize((fg_w, fg_h-threshold), Image.BICUBIC)
            else:
                # Shrink the width
                image_fg = image_fg.resize((fg_w-threshold, fg_h), Image.BICUBIC)

    image_bg = cv2.cvtColor(image_bg, cv2.COLOR_BGR2BGRA)
    image_bg = np2pil_a(image_bg)
    # image_fg = np2pil_a(image_fg)
    # Use fg's own alpha channel as the paste mask
    image_bg.paste(image_fg, (roi_w, roi_h), image_fg)
    image_bg = pil2np(image_bg)

    # cv2.imshow("get_image_paste", image_bg)
    # cv2.waitKey(0)
    return image_bg


def random_color(dims=3): color = [0]*dims find_flag = 0 while not find_flag: for dim in
range(dims): color[dim] = random.randint(0, 255) if color[dim] <= 125: find_flag = 1 # RGB # color_list = [ # [207, 91, 85], # [0, 201, 88], # [117, 74, 57], # [210, 210, 27], # [160, 157, 152], # [181, 210, 210], # [27, 112, 107], # [87, 26, 44], # [115, 19, 20], # [161, 210, 68], # [210, 108, 12], # [112, 9, 142], # [50, 41, 84], # [72, 52, 210], # [210, 177, 89], # [148, 200, 89], # [173, 116, 109], # [185, 185, 210], # [181, 7, 210], # [80, 210, 30], # [65, 72, 98], # [210, 123, 109], # [19, 64, 95], # [128, 21, 210], # [129, 137, 60] # ] # color = random.sample(color_list, 1)[0] return tuple(color) def create_noise(image_np): ic = ImageCaptcha() image_pil = np2pil(image_np) if random.choice([0, 1]): draw = ImageDraw.Draw(image_pil) for i in range(0, random.randint(100, 1000)): xy = (random.randrange(0, image_pil.size[0]), random.randrange(0, image_pil.size[1])) r = random.randint(0, 200) fill = (r, r, r) draw.point(xy, fill=fill) del draw for i in range(random.randint(2, 4)): image_pil = ic.create_noise_curve(image_pil, random_color()) image_pil = ic.create_noise_dots(image_pil, random_color(), random.randint(3, 6), random.randint(30, 60)) image_np = pil2np(image_pil) return image_np def create_noise_short_lines(image_np, line_num): image_np = copy.deepcopy(image_np) h, w = image_np.shape[:2] len_range = (5, int(w/2)) for i in range(line_num): p1 = (random.randint(0, w), random.randint(0, h)) find_flag = False while not find_flag: p2 = (random.randint(0, w), random.randint(0, h)) p_len = math.sqrt(math.pow((p1[0]-p2[0]), 2) + math.pow((p1[1]-p2[1]), 2)) if p_len < len_range[0] or p_len > len_range[1]: find_flag = True color = (random.randint(50, 200), random.randint(50, 200), random.randint(50, 200)) thickness = random.randint(1, 1) cv2.line(image_np, p1, p2, color, thickness=thickness) return image_np def create_noise_point(image_np, point_num): image_np = copy.deepcopy(image_np) h, w = image_np.shape[:2] for i in range(point_num): p_center = 
(random.randint(1, w-2), random.randint(1, h-2)) p_list = [(p_center[0]-1, p_center[1]), (p_center[0]+1, p_center[1]), p_center, (p_center[0], p_center[1]-1), (p_center[0], p_center[1]+1)] p_color = [random.randint(0, 130), random.randint(0, 130), random.randint(0, 130)] for p in p_list: if random.choice([0, 1]): if random.choice([0, 1]): _index = p_color.index(max(p_color)) p_color[_index] = random.randint(200, 255) image_np[p[1], p[0], 0] = p_color[0] image_np[p[1], p[0], 1] = p_color[1] image_np[p[1], p[0], 2] = p_color[2] return image_np


def get_iou(x1, y1, x2, y2, a1, b1, a2, b2):
    # x-coordinate of the top-left corner of the intersection
    ax = max(x1, a1)
    # y-coordinate of the top-left corner of the intersection
    ay = max(y1, b1)
    # x-coordinate of the bottom-right corner of the intersection
    bx = min(x2, a2)
    # y-coordinate of the bottom-right corner of the intersection
    by = min(y2, b2)

    # Areas of the two boxes
    area_n = (x2 - x1) * (y2 - y1)
    area_m = (a2 - a1) * (b2 - b1)

    # Intersection area; zero when the boxes do not overlap
    w = max(0, bx - ax)
    h = max(0, by - ay)
    area_x = w * h

    # Intersection over union
    return area_x / (area_n + area_m - area_x)


def preprocess_true_boxes(true_boxes, input_shape, anchors, num_classes):
    """Preprocess true boxes to training input format

    Parameters
    ----------
    true_boxes: array, shape=(m, T, 5)
        Absolute x_min, y_min, x_max, y_max, class_id relative to input_shape.
    input_shape: array-like, hw, multiples of 32
    anchors: array, shape=(N, 2), wh
    num_classes: integer

    Returns
    -------
    y_true: list of array, shape like yolo_outputs, xywh are relative values

    """
    # print(true_boxes.shape)
    # print(true_boxes[..., 4])
    # print(num_classes)
    # Debug aid: dump the class ids if the comparison itself fails
    try:
        sss = (true_boxes[..., 4] < num_classes).all()
    except Exception:
        print(true_boxes[..., 4])

    assert (true_boxes[..., 4] < num_classes).all(), 'class id must be less than num_classes'
    # default setting
    num_layers = len(anchors)//3
    anchor_mask = [[6, 7, 8], [3, 4, 5], [0, 1, 2]] if num_layers == 3 else [[3, 4, 5], [1, 2, 3]]

    true_boxes = np.array(true_boxes, dtype='float32')
    input_shape = np.array(input_shape, dtype='int32')
    # Box centers and sizes, then normalized by the input (w, h)
    boxes_xy = (true_boxes[..., 0:2] + true_boxes[..., 2:4]) // 2
    boxes_wh = true_boxes[..., 2:4] - true_boxes[..., 0:2]
    true_boxes[..., 0:2] = boxes_xy/input_shape[::-1]
    true_boxes[..., 2:4] = boxes_wh/input_shape[::-1]

    m = true_boxes.shape[0]
    grid_shapes = [input_shape//{0: 32, 1: 16, 2: 8}[l] for l in range(num_layers)]
    y_true = [np.zeros((m, grid_shapes[l][0], grid_shapes[l][1],len(anchor_mask[l]), 5+num_classes),
                       dtype='float32') for l in range(num_layers)]

    # Expand dim to apply broadcasting.
    anchors = np.expand_dims(anchors, 0)
    anchor_maxes = anchors / 2.
    anchor_mins = -anchor_maxes
    valid_mask = boxes_wh[..., 0] > 0

    for b in range(m):
        # Discard zero rows.
        wh = boxes_wh[b, valid_mask[b]]
        if len(wh) == 0:
            continue
        # Expand dim to apply broadcasting.
        wh = np.expand_dims(wh, -2)
        box_maxes = wh / 2.
        box_mins = -box_maxes

        intersect_mins = np.maximum(box_mins, anchor_mins)
        intersect_maxes = np.minimum(box_maxes, anchor_maxes)
        intersect_wh = np.maximum(intersect_maxes - intersect_mins, 0.)
intersect_area = intersect_wh[..., 0] * intersect_wh[..., 1] box_area = wh[..., 0] * wh[..., 1] anchor_area = anchors[..., 0] * anchors[..., 1] iou = intersect_area / (box_area + anchor_area - intersect_area) # Find best anchor for each true box best_anchor = np.argmax(iou, axis=-1) for t, n in enumerate(best_anchor): for l in range(num_layers): if n in anchor_mask[l]: i = np.floor(true_boxes[b,t,0]*grid_shapes[l][1]).astype('int32') j = np.floor(true_boxes[b,t,1]*grid_shapes[l][0]).astype('int32') k = anchor_mask[l].index(n) c = true_boxes[b, t, 4].astype('int32') y_true[l][b, j, i, k, 0:4] = true_boxes[b, t, 0:4] y_true[l][b, j, i, k, 4] = 1 y_true[l][b, j, i, k, 5+c] = 1 return y_true def get_puzzle(shape=(80, 80)): # 创建空图 image_pil = Image.new('RGBA', (shape[1], shape[0]), (255, 255, 255, 0)) draw = ImageDraw.Draw(image_pil) # 居中创建矩形 rec_shape = (40, 40) left_up_point = [int((shape[0]-rec_shape[0])/2), int((shape[1]-rec_shape[1])/2)] right_down_point = [left_up_point[0]+rec_shape[0], left_up_point[1]+rec_shape[1]] # 透明度 flag = random.choice([0, 1]) if flag: alpha = random.randint(100, 150) else: alpha = random.randint(160, 255) # 背景色 if flag: r = random.randint(0, 30) else: r = random.randint(100, 180) # r = random.randint(0, 255) fill_color = (r, r, r, alpha) # 边缘色 if random.choice([0, 1, 1]): if flag: r = random.randint(140, 170) else: r = random.randint(70, 100) outline_color = (r, r, r) else: outline_color = (fill_color[0], fill_color[1], fill_color[2]) draw.rectangle((left_up_point[1], left_up_point[0], left_up_point[1]+rec_shape[1], left_up_point[0]+rec_shape[0]), fill=fill_color, outline=outline_color) # 拼图的圆或半圆 radius = random.randint(int(rec_shape[0] / 3 / 2), int(rec_shape[0] / 3 / 1.2)) center_list = [[left_up_point[1], int((right_down_point[0]+left_up_point[0])/2), 1], [right_down_point[1], int((right_down_point[0]+left_up_point[0])/2), 1], [int((right_down_point[1]+left_up_point[1])/2), left_up_point[0], 0], 
                   [int((right_down_point[1]+left_up_point[1])/2), right_down_point[0], 0]
                   ]
    circle_num = random.randint(1, 4)
    # print("circle_num", circle_num)
    center_list = random.sample(center_list, circle_num)

    # Bounding box of the piece; grows when a filled circle sticks out
    min_w, min_h = left_up_point[1], left_up_point[0]
    max_w, max_h = right_down_point[1], right_down_point[0]
    for center in center_list:
        w, h = center[:2]
        is_width = center[2]

        # Shift along the width or the height, depending on the edge
        into_ratio = random.randint(int(1/2*radius), int(3/4*radius))
        if is_width:
            # Randomly push the circle inward or outward
            if random.choice([0, 1]):
                center = (center[0]+into_ratio, center[1])
            else:
                center = (center[0]-into_ratio, center[1])
        else:
            if random.choice([0, 1]):
                center = (center[0], center[1]+into_ratio)
            else:
                center = (center[0], center[1]-into_ratio)

        # Decide transparency: a circle centered inside the rectangle is cut out
        color = fill_color
        if is_width:
            if left_up_point[1] <= center[0] <= right_down_point[1]:
                color = (0, 0, 0, 0)
        else:
            if left_up_point[0] <= center[1] <= right_down_point[0]:
                color = (0, 0, 0, 0)

        # print("center, color, alpha", center, color, alpha)
        draw.ellipse([(center[0]-radius, center[1]-radius),
                      (center[0]+radius, center[1]+radius)],
                     fill=color,
                     outline=outline_color)

        # Patch the edge color of a filled circle
        if color[3] == alpha:
            if is_width:
                if center[0] < w:
                    draw.rectangle((w,
                                    h-radius,
                                    center[0]+radius,
                                    center[1]+radius),
                                   fill=fill_color)
                else:
                    draw.rectangle((center[0]-radius,
                                    center[1]-radius,
                                    w,
                                    h+radius),
                                   fill=fill_color)
            else:
                if center[1] < h:
                    draw.rectangle((w-radius,
                                    h,
                                    center[0]+radius,
                                    center[1]+radius),
                                   fill=fill_color)
                else:
                    draw.rectangle((center[0]-radius,
                                    center[1]-radius,
                                    w+radius,
                                    h),
                                   fill=fill_color)
        # Patch the edge color of a transparent (cut-out) circle
        else:
            if is_width:
                if center[0] > w:
                    draw.rectangle((center[0]-radius,
                                    center[1]-radius,
                                    w,
                                    h+radius),
                                   fill=(0, 0, 0, 0))
                else:
                    draw.rectangle((w,
                                    h-radius,
                                    center[0]+radius,
                                    center[1]+radius),
                                   fill=(0, 0, 0, 0))
            else:
                if center[1] > h:
                    draw.rectangle((center[0]-radius,
                                    center[1]-radius,
                                    w+radius,
                                    h),
                                   fill=(0, 0, 0, 0))
                else:
                    draw.rectangle((w-radius,
                                    h,
                                    center[0]+radius,
                                    center[1]+radius),
                                   fill=(0, 0, 0, 0))

        # Extend the bounding box by the area a filled circle adds
        if color[3] == alpha:
            if center[0]-radius <= min_w:
                min_w = center[0]-radius
            if center[0]+radius >= max_w:
                max_w = center[0]+radius
            if center[1]-radius <= min_h:
                min_h = center[1]-radius
            if center[1]+radius >= max_h:
                max_h = center[1]+radius

    image_pil = image_pil.crop([min_w, min_h, max_w+1, max_h+1])
    # image_pil.show("2")
    return image_pil


def puzzle_on_image(image_np, puzzle_shape, image_shape):
    position_list = []
    # Generate the puzzle-piece image
    puzzle_image_pil = get_puzzle()
    puzzle_image_pil = puzzle_image_pil.resize(puzzle_shape)
    image_np = pil_resize(image_np, image_shape[0], image_shape[1])

    # PIL size is (w, h); NumPy shape is (h, w)
    fg_w, fg_h = puzzle_image_pil.size[:2]
    bg_h, bg_w = image_np.shape[:2]

    # Pick where to place the puzzle piece
    position_h = random.randint(0, bg_h-fg_h)
    position_w = random.randint(0, bg_w-fg_w)
    position_list.append([position_h, position_w])

    # for p in position_list:
    #     cv2.rectangle(image_np, (p[1], p[0]),
    #                   (p[1]+puzzle_shape[1], p[0]+puzzle_shape[0]),
    #                   (0, 0, 255), 1)

    # Paste the puzzle piece onto the background image
    image_np = get_image_paste(image_np, puzzle_image_pil, position_h, position_w)
    # cv2.imshow("puzzle_on_image", image_np)
    # cv2.waitKey(0)
    return image_np, position_list


def get_drag_image(image_np):
    h, w = image_np.shape[:2]

    # Keep only the top part of the image, up to a random height
    clip_h = random.randint(int(1/4*h), int(3/4*h))
    image_clip = image_np[:clip_h, ...]

    # Cut the image at a random width and re-join the halves in swapped order
    clip_w = random.randint(int(1/6*w), int(5/6*w))
    image_w1 = image_clip[:, :clip_w, ...]
    image_w2 = image_clip[:, clip_w:, ...]
    image_new = np.concatenate([image_w2, image_w1], axis=1)

    # Dividing line between the two re-joined halves
    clip_line = [(image_w2.shape[1], 0), (image_w2.shape[1], clip_h)]

    # show
    # print(clip_line)
    # cv2.line(image_new, clip_line[0], clip_line[1], (0, 0, 255), 2)
    # cv2.imshow("get_drag_image", image_new)
    # cv2.waitKey(0)
    return image_new, clip_line


def get_real_data_puzzle(shape=(160, 256)):
    paths = glob("../data/detect2_real/*")
    i = 10000
    for p in paths:
        # Resized original
        image = cv2.imread(p)
        image = pil_resize(image, shape[0], shape[1])
        cv2.imwrite("../data/detect2_real/"+str(i)+".jpg", image)
        i += 1

        # Color-distorted copy
        image = image_enhance_distort(image)
        cv2.imwrite("../data/detect2_real/"+str(i)+".jpg", image)
        i += 1

        # Flipped copy of the distorted image
        image = image_enhance_flip(image)
        cv2.imwrite("../data/detect2_real/"+str(i)+".jpg", image)
        i += 1


def read_label_puzzle():
    paths = glob("../data/detect2_real/*.json")
    map_path = "../data/detect2_real/map.txt"
    with open(map_path, "a") as f:
        for p in paths:
            with open(p, "r") as fp:
                _dict = json.loads(fp.read())
                points = _dict.get("shapes")[0].get("points")
                image_path = _dict.get("imagePath")
                ps = [str(int(points[0][0])), str(int(points[0][1])),
                      str(int(points[1][0])), str(int(points[1][1]))]
                p_str = ",".join(ps)
                # One line per image: "<image path> x1,y1,x2,y2,0"
                f.write(image_path + " " + p_str + ",0" + "\n")


def fix_map_txt():
    path = "../data/map.txt"
    with open(path, "r") as f:
        _list = f.readlines()

    with open("../data/map_new.txt", "w") as f:
        new_list = []
        for line in _list:
            ss = line.split(" ")
            # Drop the trailing newline and the class id
            ps = ss[-1][:-1].split(",")[:-1]
            # With probability 3/4, grow the box by a few pixels on every side
            if random.choice([0, 1, 1, 1]):
                pix = random.choice([1, 2, 2, 3, 3, 4, 4])
                for i in range(len(ps)):
                    if i < 2:
                        ps[i] = str(int(ps[i]) - pix)
                    else:
                        ps[i] = str(int(ps[i]) + pix)
            new_line = ss[0] + " " + ",".join(ps) + ",0\n"
            new_list.append(new_line)
            print("line", line)
            print("new_line", new_line)
        f.writelines(new_list)


def get_char_map():
    path = "../data/phrase/phrase3.txt"
    with open(path, "r") as f:
        _list = f.readlines()
    path = "../data/phrase/phrase4.txt"
    with open(path, "r") as f:
        _list += f.readlines()
    path = "../data/phrase/phrase5.txt"
    with open(path, "r") as f:
        _list += f.readlines()

    # Collect the unique characters, one per line, in sorted order
    _str = "".join(_list)
    _str = re.sub("\n", "", _str)
    _list = list(set([x+"\n" for x in _str]))
    _list.sort(key=lambda x: x)
    with open("../data/phrase/char.txt", "w") as f:
        f.writelines(_list)


def resize_base_image():
    paths = glob("../data/base/*")
    for p in paths:
        _img = cv2.imread(p)
        h, w = _img.shape[:2]
        # Cap the longer side at 600 px, keeping the aspect ratio.
        # `h >= w` (instead of `h > w`) so that large square images are resized too.
        if h > 600 and h >= w:
            best_h = 600
            best_w = int(best_h * w / h)
            _img = pil_resize(_img, best_h, best_w)
            cv2.imwrite(p, _img)
        elif w > 600 and w > h:
            best_w = 600
            best_h = int(best_w * h / w)
            _img = pil_resize(_img, best_h, best_w)
            cv2.imwrite(p, _img)


def image_enhance_distort(image_np, hue=.3, sat=3., val=3.):
    """
    Data augmentation: color jitter (random hue, saturation and value)
    :return:
    """
    def rand(a=0, b=1):
        return np.random.rand()*(b-a) + a

    # cv2.imshow("distort_image1", image_np)

    hue = rand(-hue, 0)
    sat = rand(1, sat) if rand() < .2 else 1/rand(1, sat)
    val = rand(1, val) if rand() < .2 else 1/rand(1, val)
    # print(hue, sat, val)

    image_np = cv2.cvtColor(image_np, cv2.COLOR_BGR2RGB)
    x = rgb_to_hsv(image_np/255.)
    x[..., 0] += hue
    # Wrap hue back into [0, 1]
    x[..., 0][x[..., 0] > 1] -= 1
    x[..., 0][x[..., 0] < 0] += 1
    x[..., 1] *= sat
    x[..., 2] *= val
    # Clamp all channels to [0, 1]
    x[x > 1] = 1
    x[x < 0] = 0
    image_np = hsv_to_rgb(x)

    image_np = cv2.cvtColor(np.uint8(image_np*255), cv2.COLOR_RGB2BGR)

    # cv2.imshow("distort_image2", image_np)
    # cv2.waitKey(0)
    return image_np


def image_enhance_flip(image_np):
    """
    Data augmentation: flip
    :param image_np:
    :return:
    """
    # cv2.imshow("flip_image1", image_np)
    if random.choice([0, 1]):
        # Horizontal flip
        image_np = cv2.flip(image_np, 1)
    else:
        # Vertical flip
        image_np = cv2.flip(image_np, 0)
    # cv2.imshow("flip_image2", image_np)
    # cv2.waitKey(0)
    return image_np


def image_enhance_blur(image_pil, blur_mode=2):
    """
    Data augmentation: blur
    :param image_pil:
    :param blur_mode: 1: standard blur  2: Gaussian blur  3: box (mean) blur
    :return:
    """
    # image_pil.show()
    if blur_mode == 1:
        image_pil = image_pil.filter(ImageFilter.BLUR)
    elif blur_mode == 2:
        image_pil = image_pil.filter(ImageFilter.GaussianBlur(1))
    elif blur_mode == 3:
        image_pil = image_pil.filter(ImageFilter.BoxBlur(10))
    # image_pil.show()
    return image_pil


def image_enhance_color(image_pil, _range=(.5, 6.)):
    """
    Data augmentation: color saturation
    :param image_pil:
    :param _range:
    :return:
    """
    # image_pil.show()
    enh_col = ImageEnhance.Color(image_pil)
    color = random.uniform(_range[0], _range[1])
    # print("color", color)
    image_colored = enh_col.enhance(color)
    # image_colored.show()
    return image_colored


def image_enhance_brightness(image_pil, _range=(.3, 2.)):
    """
    Data augmentation: brightness
    :param image_pil:
    :param _range:
    :return:
    """
    enh_bri = ImageEnhance.Brightness(image_pil)
    brightness = random.uniform(_range[0], _range[1])
    # print("brightness", brightness)
    image_brightened = enh_bri.enhance(brightness)
    return image_brightened


def image_enhance_contrast(image_pil, _range=(.3, 6.)):
    """
    Data augmentation: contrast
    :param image_pil:
    :param _range:
    :return:
    """
    # image_pil.show()
    enh_con = ImageEnhance.Contrast(image_pil)
    contrast = random.uniform(_range[0], _range[1])
    # print("contrast", contrast)
    image_contrasted = enh_con.enhance(contrast)
    # image_contrasted.show()
    return image_contrasted


def image_enhance_sharpness(image_pil, _range=(4., 8.)):
    """
    Data augmentation: sharpness
    :param image_pil:
    :param _range:
    :return:
    """
    # image_pil.show()
    enh_sha = ImageEnhance.Sharpness(image_pil)
    sharpness = random.uniform(_range[0], _range[1])
    # print("sharpness", sharpness)
    image_sharped = enh_sha.enhance(sharpness)
    # image_sharped.show()
    return image_sharped


def image_enhance_wrap(image_pil, dx_factor=0.3, dy_factor=0.3):
    """Data augmentation: warp the image with a random quadrilateral transform"""
    width, height = image_pil.size
    dx = width * dx_factor
    dy = height * dy_factor
    x1 = int(random.uniform(-dx, dx))
    y1 = int(random.uniform(-dy, dy))
    x2 = int(random.uniform(-dx, dx))
    y2 = int(random.uniform(-dy, dy))
    # Pad the canvas so the warped corners stay inside the image
    warp_image = Image.new('RGB',
                           (width + abs(x1) + abs(x2),
                            height + abs(y1) + abs(y2)))
    warp_image.paste(image_pil, (abs(x1), abs(y1)))
    width2, height2 = warp_image.size
    warp_image = warp_image.transform((width, height), Image.QUAD,
                                      (x1, y1,
                                       -x1, height2 - y2,
                                       width2 + x2, height2 + y2,
                                       width2 - x2, -y1))
    # Debug preview, disabled like the other augmentations so that
    # a viewer window is not opened on every call
    # warp_image.show()
    return warp_image


def eight_neighbour(image_np, k=4):
    """
    8-neighbourhood denoising: whiten dark pixels with fewer than k dark neighbours
    :return:
    """
    def calculate_noise_count(img_obj, w, h):
        """
        Count the non-white pixels in the 8-neighbourhood of (w, h)
        """
        count = 0
        width, height = img_obj.shape
        for _w_ in [w - 1, w, w + 1]:
            for _h_ in [h - 1, h, h + 1]:
                if _w_ > width - 1:
                    continue
                if _h_ > height - 1:
                    continue
                if _w_ == w and _h_ == h:
                    continue
                if img_obj[_w_, _h_] < 230:  # use 255 here for a binarized image
                    count += 1
        return count

    w, h = image_np.shape
    for _w in range(w):
        for _h in range(h):
            # Always whiten the first row and the first column
            if _w == 0 or _h == 0:
                image_np[_w, _h] = 255
                continue
            # Skip pixels that are already white
            pixel = image_np[_w, _h]
            if pixel == 255:
                continue

            # Whiten the pixel when it has too few dark neighbours
            if calculate_noise_count(image_np, _w, _h) < k:
                image_np[_w, _h] = 255
    return image_np


def connected_component(image_np):
    # Invert black and white
    image_np = 255 - image_np
    # Binarize
    ret, image_np = cv2.threshold(image_np, 60, 255, cv2.THRESH_BINARY)

    # Dilate, then erode
    kernel = np.ones((2, 2), np.uint8)
    image_np = cv2.dilate(image_np, kernel, iterations=1)
    kernel = np.ones((2, 2), np.uint8)
    image_np = cv2.erode(image_np, kernel, iterations=1)

    return image_np

    # NOTE: the code from here to the end of this function is unreachable
    # because of the return above, and `binary_img` is never defined here.
    w, h = binary_img.shape
    color = []
    color.append((0, 0, 0))
    img_color = np.zeros((w, h, 3), dtype=np.uint8)
    retval, labels, stats, centroids = cv2.connectedComponentsWithStats(binary_img)
    # Assign a random BGR color to every component; label 0 (background) stays black
    for num in range(1, retval):
        color_b = random.randint(0, 255)
        color_g = random.randint(0, 255)
        color_r = random.randint(0, 255)
        color.append((color_b, color_g, color_r))
    for x in range(w):
        for y in range(h):
            label = labels[x, y]
            img_color[x, y, :] = color[int(label)]
    # cv2.imshow("img_color", img_color)
    #
    # cv2.waitKey(0)
    return img_color


def add_contrast(image_np):
    # cv2.imshow("image_np", image_np)

    img = image_np.astype(np.float32)
    bri_mean = np.mean(img)

    # a = np.arange(5, 16, 5) / 10
    # b = np.arange(-30, 31, 30)
    #
    # a_len = len(a)
    # b_len = len(b)
    # print(a_len, b_len)
    #
    # for i in range(a_len):
    #     for j in range(b_len):
    #         aa = a[i]
    #         bb = b[j]
    #         img_a = aa * (img-bri_mean) + bb + bri_mean
    #         print(i, j, aa, bb)
    #         img_a = np.clip(img_a, 0, 255).astype(np.uint8)
    #         cv2.imshow("img_a", img_a)
    #         cv2.waitKey(0)

    # Scale the deviation from the mean brightness by aa, then shift by bb
    aa = 3
    bb = -50
    img_a = aa * (img-bri_mean) + bb + bri_mean
    img_a = np.clip(img_a, 0, 255).astype(np.uint8)
    # cv2.imshow("img_a", img_a)
    # cv2.waitKey(0)
    return img_a


def get_image_legal(image_np):
    # Ratio of the minority class to the majority class, split at pixel value 80
    white = np.sum(image_np > 80)
    black = np.sum(image_np <= 80)
    if black > white:
        ratio = white / black
    else:
        ratio = black / white
    return ratio


if __name__ == "__main__":
    generate_data_char()

    # for _p in glob("../data/test/FileInfo1021/*"):
    #     add_contrast(cv2.imread(_p))

    # print(random.uniform(0.5, 0.5))

    # paths = glob('../data/equation/*')
    # for p in paths:
    #     img = cv2.imread(p)
    #     if img.shape[0] == 69 and img.shape[1] == 330:
    #         print(p)
    #         os.remove(p)

    # image_enhance_wrap(Image.open("D:/Project/captcha/data/equation/30_4_-_零.jpg"))