SYSTEM AND METHOD FOR PREDICTING APPLICATION PERFORMANCE FOR LARGE DATA SIZE ON BIG DATA CLUSTER
Abstract
A system and method for estimating the execution time of an application on the Spark platform in a production environment. An application on the Spark platform is executed as a sequence of Spark jobs. Each Spark job is executed as a directed acyclic graph (DAG) consisting of stages. Each stage has multiple executors running in parallel, and each executor has a set of concurrent tasks. Each executor spawns multiple threads, one for each task. All jobs in the same executor share the same JVM memory. The execution time of each Spark job is predicted as the sum of the estimated execution times of all its stages. The execution time comprises scheduler delay, serialization time, de-serialization time, and JVM overheads. The JVM time estimate depends on the type of computation hardware system and the number of threads.
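
The summation described in the abstract can be illustrated with a minimal sketch. It is not the patented method: the names `StageEstimate`, `estimate_job_time`, and `estimate_application_time` are hypothetical, the per-stage component values are assumed to be supplied by some external estimator, and summing job times for the whole application is an assumption based on the jobs running as a sequence.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass(frozen=True)
class StageEstimate:
    """Estimated time components of one stage, in seconds (hypothetical type)."""
    scheduler_delay: float
    serialization: float
    deserialization: float
    jvm_overhead: float

    def total(self) -> float:
        # The stage estimate is taken as the sum of the four components
        # named in the abstract.
        return (self.scheduler_delay + self.serialization
                + self.deserialization + self.jvm_overhead)


def estimate_job_time(stages: Iterable[StageEstimate]) -> float:
    """Job time predicted as the sum of the estimated times of its stages."""
    return sum(stage.total() for stage in stages)


def estimate_application_time(jobs: Iterable[List[StageEstimate]]) -> float:
    """Assumption: jobs run one after another, so their times are added."""
    return sum(estimate_job_time(stages) for stages in jobs)


if __name__ == "__main__":
    # Illustrative values only; they are not measurements.
    job = [
        StageEstimate(0.5, 0.25, 0.25, 1.0),
        StageEstimate(1.0, 0.5, 0.5, 2.0),
    ]
    print(estimate_job_time(job))
    print(estimate_application_time([job, job]))
```

How each component is estimated, for example how the JVM overhead varies with hardware type and thread count, is left outside this sketch because the abstract does not specify it.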