Improving a Page Classifier with Anchor Extraction and Link Analysis


Abstract

Most text categorization systems use simple models of documents and document collections. In this paper we describe a technique that improves a simple web page classifier's performance on pages from a new, unseen web site, by exploiting link structure within a site as well as page structure within hub pages. On real-world test cases, this technique significantly and substantially improves the accuracy of a bag-of-words classifier, reducing error rate by about half, on average. The system uses a variant of co-training to exploit unlabeled data from a new site. Pages are labeled using the base classifier; the results are used by a restricted wrapper-learner to propose potential "main-category anchor wrappers"; and finally, these wrappers are used as features by a third learner to find a categorization of the site that implies a simple hub structure, but which also largely agrees with the original bag-of-words classifier.
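The three stages named in the abstract (label pages with a base classifier, propose anchor wrappers, then recategorize the site) can be illustrated with a minimal Python sketch. It is not the paper's system. The wrapper representation (a hub URL plus a DOM path), the thresholds, the sample site, and the function names are all assumptions made for illustration, and the final relabeling rule is a plain heuristic standing in for the third learner.

```python
from collections import defaultdict


def propose_wrappers(anchors, base_labels, min_precision=0.5, min_size=2):
    """Propose candidate "main-category anchor wrappers".

    anchors: iterable of (hub_url, dom_path, target_url) triples.
    base_labels: dict mapping target_url -> bool from the base classifier.
    A wrapper here is simply the set of targets reached from one hub page
    through one DOM path (an assumed, simplified representation).
    """
    groups = defaultdict(set)
    for hub, dom_path, target in anchors:
        groups[(hub, dom_path)].add(target)

    kept = {}
    for key, targets in groups.items():
        positives = sum(1 for t in targets if base_labels.get(t, False))
        # Keep wrappers whose targets the base classifier mostly accepts.
        if len(targets) >= min_size and positives / len(targets) >= min_precision:
            kept[key] = targets
    return kept


def relabel(base_labels, wrappers):
    """Heuristic stand-in for the third learner: pages extracted by a kept
    wrapper become positive; every other page keeps its base label."""
    final = dict(base_labels)
    for targets in wrappers.values():
        for target in targets:
            final[target] = True
    return final


# Hypothetical site: one hub page with an item list and a navigation bar.
anchors = [
    ("/index", "ul.items/li/a", "/p1"),
    ("/index", "ul.items/li/a", "/p2"),
    ("/index", "ul.items/li/a", "/p3"),
    ("/index", "div.nav/a", "/about"),
    ("/index", "div.nav/a", "/contact"),
]
# Assumed output of a bag-of-words classifier; "/p3" is a false negative.
base_labels = {"/p1": True, "/p2": True, "/p3": False,
               "/about": False, "/contact": False}

wrappers = propose_wrappers(anchors, base_labels)
final_labels = relabel(base_labels, wrappers)
```

In this toy site the item-list wrapper is kept because two of its three targets are labeled positive, so "/p3" is pulled into the main category, while the navigation links keep their negative labels. The result implies a simple hub structure and still agrees with the base classifier on four of the five pages.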


Bibliographic details

Source: Annual neural information processing systems conference | 2003 | 8 pages
Author: William W. Cohen
Original format: PDF
Classification (Chinese Library Classification): Information processing

