1.
Pouria Pirzadeh Junichi Tatemura Oliver Po Hakan Hacıgümüş 《Journal of Grid Computing》2012,10(1):109-132
Recently there has been a considerable increase in the number of different key-value stores for supporting data storage and
applications in the cloud environment. While all these solutions try to offer highly available and scalable services in the
cloud, they differ significantly from each other in their architecture and in the types of applications they try
to support. Considering three widely used systems of this kind (Cassandra, HBase and Voldemort), in this paper we compare them in terms
of their support for different types of query workloads. We focus mainly on range queries. Unlike HBase and Cassandra,
which have built-in support for range queries, Voldemort does not support this type of query via its available API. For this
reason, practical techniques are presented on top of Voldemort to support range queries. Our performance evaluation is based
on mixed query workloads, in the sense that they contain a combination of short and long range queries alongside other types
of typical queries on key-value stores, such as lookup and update. We show that there are trade-offs between the performance of
the selected system and scheme and the types of query workloads that can be processed efficiently.
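The abstract does not say which techniques the authors layered on top of Voldemort, so the following is only a minimal sketch of one common way to answer range queries over a store that offers nothing but get and put. It assumes integer keys and keeps a small sorted key index per fixed-width bucket. `KVStore`, `RangeIndexedStore`, and `BUCKET_SIZE` are hypothetical names chosen for illustration and are not part of any real Voldemort API.

```python
from bisect import bisect_left, bisect_right

BUCKET_SIZE = 100  # hypothetical bucket width for integer keys


class KVStore:
    """Stand-in for a store that only offers get/put (no range scan)."""

    def __init__(self):
        self._data = {}

    def get(self, key):
        return self._data.get(key)

    def put(self, key, value):
        self._data[key] = value


class RangeIndexedStore:
    """Adds range queries by keeping a sorted key list per bucket."""

    def __init__(self, store):
        self.store = store

    def put(self, key, value):
        # Store the value itself, then register the key in its bucket index.
        self.store.put(("data", key), value)
        bucket_key = ("index", key // BUCKET_SIZE)
        keys = self.store.get(bucket_key) or []
        pos = bisect_left(keys, key)
        if pos == len(keys) or keys[pos] != key:
            keys.insert(pos, key)
            self.store.put(bucket_key, keys)

    def get(self, key):
        return self.store.get(("data", key))

    def range(self, lo, hi):
        """Return (key, value) pairs with lo <= key <= hi, in key order."""
        result = []
        # Visit every bucket the range overlaps, one index lookup per bucket.
        for bucket in range(lo // BUCKET_SIZE, hi // BUCKET_SIZE + 1):
            keys = self.store.get(("index", bucket)) or []
            start = bisect_left(keys, lo)
            end = bisect_right(keys, hi)
            for k in keys[start:end]:
                result.append((k, self.store.get(("data", k))))
        return result
```

In this sketch a short range touches one or two index buckets, while a long range costs one index lookup per bucket plus one lookup per matching key, and every write pays for an extra index update. That is one concrete form the trade-off between query types could take under these assumptions.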
2.
3.
Ning Zhang Junichi Tatemura Jignesh M. Patel Hakan Hacigumus 《The VLDB Journal The International Journal on Very Large Data Bases》2014,23(2):329-354
Data center operators face a bewildering set of choices when considering how to provision resources on machines with complex I/O subsystems. Modern I/O subsystems often have a rich mix of fast, high-performing, but expensive SSDs sitting alongside cheaper but relatively slower (for random accesses) traditional hard disk drives. Data center operators need to determine how to provision the I/O resources for specific workloads so as to abide by existing service level agreements while minimizing the total operating cost (TOC) of running the workload, where the TOC includes the amortized hardware costs and the run-time energy costs. The focus of this paper is on introducing this new problem of TOC-based storage allocation, cast in a framework that is compatible with traditional DBMS query optimization and query processing architecture. We also present a heuristic-based solution to this problem, called DOT. We have implemented DOT in PostgreSQL, and experiments using TPC-H and TPC-C demonstrate significant TOC reduction by DOT in various settings.
4.
Jong Wook Kim K. Selçuk Candan Junichi Tatemura 《Journal of Signal Processing Systems》2010,58(3):407-421
As their popularity as dynamic platforms for information dissemination and sharing increases, the use of Weblogs (blogs) which
track and comment on real-world (political, news, entertainment) events is also growing. The success of the blog as a popular
medium for information sharing, on the other hand, is also its weakest spot, in that there is little support beyond keyword-based
searches for blog entries. Consequently, there is a pressing need for navigational support that can help users relate
entries within the large, diverse, and inherently distributed blogosphere. In this paper, we note that the existence of
large degrees of content overlap in the form of quotation/commentary pairs (as well as content borrowings across media outlets)
can be leveraged for tracking the topic development patterns within the blogosphere. Building on this observation, we first
propose focus and flow analysis techniques that rely on reuse detection to help place blog entries into
logical organizations. We then show that these implicit or explicit quotations, as well as focus analysis, can be leveraged
to identify the contexts in which entries occur, resulting in more effective tagging. We therefore propose CDIP (a collection-driven,
yet individuality-preserving tagging system), which relies on relationships provided by quotation/reuse detection and semantic-focus
analysis to automatically tag the blogs so that not only do related blogs share tags, but the individuality
of the entries is also preserved for discriminating tag-based access.
5.
6.
Chen Songting Li Hua-Gang Tatemura Jun'ichi Hsiung Wang-Pin Agrawal Divyakant Candan K. Selçuk 《Knowledge and Data Engineering, IEEE Transactions on》2008,20(12):1627-1640
An XML publish/subscribe system needs to filter a large number of queries over XML streams. Most existing systems only consider filtering simple XPath statements. In this paper, we focus on filtering the more complex Generalized-Tree-Pattern (GTP) queries. Our filtering mechanism is based on a novel Tree-of-Path (TOP) encoding scheme, which compactly represents the path matches for the entire document. First, we show that the TOP encodings can be efficiently produced via shared bottom-up path matching. Second, with the aid of this TOP encoding, we can 1) achieve polynomial time and space complexity for post-processing, 2) avoid redundant predicate evaluations, 3) allow an efficient duplicate-free, merge-join-based algorithm for merging multiple encoded path matches, and 4) simplify the processing of GTP queries. Overall, our approach maximizes the sharing opportunity across queries by exploiting suffix as well as prefix sharing. At the same time, our TOP encodings allow efficient post-processing for GTP queries. Extensive performance studies show that our GFilter solution not only achieves significantly better filtering performance than state-of-the-art algorithms, but is also capable of efficiently filtering the more complex GTP queries.
7.
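This study investigates the relationship between the infrared fingerprint spectra of Zanthoxylum nitidum (两面针) and its antitumor activity. Based on the characteristic peak intensities in the infrared fingerprints of chloroform extracts of Zanthoxylum nitidum from different regions, together with their antitumor effects, the backward method was used to build spectrum-effect models for the inhibition of two tumor cell lines: human gastric adenocarcinoma 7901 and human cervical cancer HeLa. The deviation between the values predicted by the resulting mathematical models and the measured values was within 10% in every case, indicating that the infrared fingerprint of Zanthoxylum nitidum is correlated with its antitumor activity. The results show that alkaloid constituents play an important role in the antitumor activity of Zanthoxylum nitidum.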
8.
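To address the problem of fitting a model to samples with different confidence levels, this paper proposes a two-pass learning method based on neural networks. It points out that the real model is a variation of the experimental model, and argues that the neural network approximating the expected value of the real model is the best network for fusing prior samples and real samples. First, a first pass of neural network learning is carried out with the prior samples as training samples, and the soft-point error tolerance interval, which depends on the hard-point information, is computed. Then, the prior samples and the real samples are used together as training samples, and the soft-point error tolerance interval and the hard-point error sensitivity coefficient are used to modify the errors of the input/target pairs during network training. This second pass yields a combined network that fits the real samples precisely while making maximal use of the information in the prior samples. Compared with knowledge-based neural networks (KBNN), the method is simpler, easier to control, and has a clearer logical meaning.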
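The abstract gives no formulas for how the errors of the input/target pairs are modified in the second learning pass, so the sketch below shows only one plausible reading and should not be taken as the authors' method. It assumes that soft points (prior samples) have errors inside their tolerance interval ignored and larger errors shrunk by the tolerance, while hard points (real samples) have their errors scaled by a sensitivity coefficient. The function name `adjusted_errors` and all parameter names are hypothetical.

```python
def adjusted_errors(outputs, targets, is_hard, tolerances, sensitivity):
    """Modify per-sample errors before they drive a weight update.

    Assumed rule (not taken from the paper):
      - hard point: error is multiplied by `sensitivity`
      - soft point: error within +/- tolerance becomes 0,
        otherwise it is shrunk toward zero by the tolerance
    """
    errors = []
    for y, t, hard, tol in zip(outputs, targets, is_hard, tolerances):
        e = t - y  # raw error of this input/target pair
        if hard:
            errors.append(sensitivity * e)
        elif abs(e) <= tol:
            errors.append(0.0)
        else:
            errors.append(e - tol if e > 0 else e + tol)
    return errors


# Usage: one hard point and two soft points with a tolerance of 0.1.
errs = adjusted_errors(
    outputs=[1.0, 1.0, 1.0],
    targets=[1.2, 1.05, 0.5],
    is_hard=[True, False, False],
    tolerances=[0.0, 0.1, 0.1],
    sensitivity=2.0,
)
```

Under this assumed rule the network is pushed hard toward the real samples, while prior samples only constrain it when the output drifts outside their tolerance interval.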
9.
Yaginuma Y. Yatabe T. Satou T. Tatemura J. Sakauchi M. 《Multimedia Tools and Applications》1997,5(1):65-77
Multimedia database systems have become increasingly important as tools for extracting and generating additional value from multimedia content. In this paper, four types of multimedia database system are proposed from the viewpoint of promising content sources: network multimedia (MM) database systems, stream MM database systems, library MM database systems, and real-world MM database systems. Important problems to be solved, i.e., what to do, are also discussed for each type of database. Three concrete multimedia systems developed by the authors' research group are then introduced and discussed as embodiments of these multimedia systems: (1) the open Global Image Retrieval and Linking System, GIRLS, for mediating the WWW data space as a network MM database system, (2) the flexible multimedia database platform GOLS, and (3) a higher-level authoring system for stream MM environments.
10.