'''
Created on January 10, 2019

@author: User
'''
import psycopg2
import codecs
import re
import os
import pandas as pd
from BiddingKG.dl.common.Utils import *


def getDatasToExcel():
    '''
    @summary: Export the pre-annotated data to Excel.
    '''
    list_entity_id = []
    list_label = []
    list_before = []
    list_center = []
    list_after = []
    list_label_text = []

    conn = psycopg2.connect(dbname="BiddingKG", user="postgres", password="postgres", host="192.168.2.101")
    cursor = conn.cursor()

    # Join each person entity with the tokens of the sentence it occurs in.
    # Label text values: 招标联系人 = tenderer contact, 代理联系人 = agency contact,
    # 联系人 = contact, 无 = none.
    sql = """
        select A.entity_id, A.label, A.entity_text, A.begin_index, A.end_index, B.tokens,
               case when A.label=1 then '招标联系人'
                    when A.label=2 then '代理联系人'
                    when A.label=3 then '联系人'
                    else '无' end as link
        from predict_entity_copy A, predict_sentences_copy B
        where A.entity_type='person'
          and A.doc_id=B.doc_id
          and A.sentence_index=B.sentence_index
        order by A.label
    """
    cursor.execute(sql)
    rows = cursor.fetchall()

    for row in rows:
        tokens = row[5]
        begin_index = row[3]
        end_index = row[4]
        entity_text = row[2]
        label_text = row[6]

        list_entity_id.append(row[0])
        list_label.append(str(row[1]))
        # spanWindow is imported from BiddingKG.dl.common.Utils; its result is
        # indexed here as [before, after], with a window size of 10.
        beforeafter = spanWindow(tokens, begin_index, end_index, 10)
        list_before.append(beforeafter[0])
        list_center.append(entity_text)
        list_after.append(beforeafter[1])
        list_label_text.append(label_text)

    columns = ["id", "label", "before", "center", "after", "label_text"]

    # Split the collected rows into `nums` parts.
    nums = 3
    parts = len(list_entity_id) // nums
    print(parts)
    i = 0
    while(i