Conference on From Innovation to Impact
Devising a Distinguishable Feature Set for Sinhala and English Script Separation on Social Media Images

Abstract

In the last decade, social media has become the most popular and ubiquitous means of creating and sharing content, accessing information, and connecting people around the world. Social media content generation is now growing rapidly in the Sri Lankan community. Some of this content has threatened peace and led to unpleasant situations in the country within the last two years. Content moderation is the best solution to this problem, but manually processing millions of posts is a difficult task. For automation, separating and identifying Sinhala and English characters in social media images is an essential step in content identification. In this work, Sinhala and English characters are identified separately in image posts and video thumbnails taken from public Facebook pages. This paper introduces a novel algorithm to separate Sinhala and English characters. It uses topological features, a contour-based technique, and a water reservoir-based technique. The proposed algorithm achieved 93% accuracy. The separated characters are then recognized using a convolutional neural network in the TensorFlow framework.
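
The abstract names a water reservoir-based technique but gives no details. As a rough illustration only, and not the authors' algorithm, the sketch below shows one common reading of the idea: water poured onto a binarized glyph from above collects in cavities bounded by strokes on both sides, and the collected area can serve as a shape feature. The function names, the 0/1 grid representation, and the column-profile simplification are all assumptions made for this example.

```python
from typing import List

# Hypothetical representation: 1 = stroke pixel, 0 = background.
Glyph = List[List[int]]


def top_profile(glyph: Glyph) -> List[int]:
    """Height of the topmost stroke pixel in each column, counted from
    the bottom row (0 if the column holds no stroke pixel)."""
    rows, cols = len(glyph), len(glyph[0])
    heights = []
    for c in range(cols):
        height = 0
        for r in range(rows):
            if glyph[r][c]:
                height = rows - r
                break
        heights.append(height)
    return heights


def top_reservoir_area(glyph: Glyph) -> int:
    """Pixels of 'water' held when water is poured from the top.

    Simplification (an assumption of this sketch): only the upper
    outline of the glyph is considered, so each column holds water up
    to the lower of the tallest strokes on its left and right.
    """
    heights = top_profile(glyph)
    n = len(heights)
    left_max, right_max = [0] * n, [0] * n
    running = 0
    for i in range(n):
        running = max(running, heights[i])
        left_max[i] = running
    running = 0
    for i in range(n - 1, -1, -1):
        running = max(running, heights[i])
        right_max[i] = running
    return sum(min(left_max[i], right_max[i]) - heights[i] for i in range(n))


def bottom_reservoir_area(glyph: Glyph) -> int:
    """Same measure with water poured from below (glyph flipped vertically)."""
    return top_reservoir_area(glyph[::-1])


# A U-shaped glyph holds water from the top but none from the bottom.
u_shape = [
    [1, 0, 0, 1],
    [1, 0, 0, 1],
    [1, 1, 1, 1],
]
features = (top_reservoir_area(u_shape), bottom_reservoir_area(u_shape))
```

In a real pipeline such reservoir areas would be only one part of a feature vector, alongside the topological and contour-based features the abstract mentions; how the paper combines them is not stated here.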