Dynamic Data Grid Replication And Data Consistency

This paper addresses the problem of replication in grid computing, focusing on replication strategies in data grid environments. We propose dynamic services for replicating and maintaining data in grid environments and for directing replicas to appropriate locations for use. To address a limitation of the Bandwidth Hierarchy-based Replication (BHR) algorithm, we propose the Dynamic Maintenance Service (DMS), a strategy for maintaining replicas dynamically. We aim to strike a balance between improved data access performance and replica consistency in data grids, and to identify more efficient ways of utilizing storage space.


1.     Introduction: A grid is a large-scale resource sharing and problem solving mechanism in virtual organizations [1]. Large numbers of computational and storage resources are linked globally to form a grid. Some scientific application areas, such as high energy physics, bioinformatics, and earth observation, produce huge amounts of data; data sizes are expected to reach terabyte or even petabyte scale in some applications [2]. Managing such huge amounts of data in a centralized manner is almost impossible due to extensively increased data access time. Data replication is a key technique for managing large data in a distributed manner: by replicating data across geographically distributed data stores, we can achieve better access time. In a data grid, a user's job requires access to a large number of files. If the required files are replicated at the site where the job is executed, the job can process the data without any communication delay. However, if the required files are not at that site, they must be fetched from other sites, which usually takes a very long time, because a single replica may reach gigabyte scale in some applications and the network bandwidth between sites is limited. As a result, job execution time becomes very long due to the delay of fetching replicas over the Internet. Dynamic replication is an optimization technique that aims to maximize the chances of data locality; that is, a dynamic replica optimizer running at a site tries to hold files that are likely to be requested in the near future. As the file hit ratio increases, job execution time decreases significantly. Various dynamic replication strategies have been introduced so far [3, 4, 5]. In this paper, we propose a novel dynamic replication strategy called BHR (Bandwidth Hierarchy-based Replication). The existing replication strategies try to maximize file locality to reduce data access time.

However, grid sites may be able to hold only a small portion of the overall data, since a very large quantity of data is produced in a data grid and the storage space at each site is limited. Therefore, the effect of this locality is limited to a certain degree. The BHR strategy benefits from another form of locality, which we call network-level locality. Even if the required file is not at the site performing the job, there will be no long delay in fetching the replica if it is located at a site that has broad bandwidth to the site of job execution. In a data grid, some sites may be located within a region where sites are linked closely; a country, for instance, can be viewed as such a network region. Network bandwidth between sites within a region will be broader than bandwidth between sites across regions, so a hierarchy of network bandwidth may appear in the Internet. If the required file is located in the same region, less time is needed to fetch it; in other words, the benefit of network-level locality can be exploited. The BHR strategy reduces data access time by maximizing this network-level locality.
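To make the effect of the bandwidth hierarchy concrete, a back-of-the-envelope fetch-time comparison can be sketched. The bandwidth and file-size figures below are illustrative assumptions, not measurements from this paper:

```python
# Illustrative sketch: the time to fetch a replica depends on where it
# resides. The bandwidth figures are assumed for illustration only.

def fetch_time_seconds(file_size_gb: float, bandwidth_mbps: float) -> float:
    """Transfer time = file size / available bandwidth."""
    size_megabits = file_size_gb * 1024 * 8   # GB -> megabits
    return size_megabits / bandwidth_mbps

INTRA_REGION_MBPS = 1000.0   # assumed broad bandwidth within a region
INTER_REGION_MBPS = 100.0    # assumed narrow bandwidth across regions

replica_gb = 2.0             # a single replica may reach gigabyte scale
local = fetch_time_seconds(replica_gb, INTRA_REGION_MBPS)
remote = fetch_time_seconds(replica_gb, INTER_REGION_MBPS)
print(f"intra-region fetch: {local:.1f} s, inter-region fetch: {remote:.1f} s")
```

With these assumed numbers, fetching from another region takes an order of magnitude longer, which is the gap that network-level locality aims to close.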

2.    Related Works: Dynamic replication is a long-term optimization technique that aims at reducing average job execution time in a data grid. Since a very large quantity of data files is deployed in a data grid, there is a limit to the number of files that can be stored at each site. If the SE (Storage Element) at a grid site is already filled with replicas, some of them must be deleted in order to store newly requested data. Kavitha Ranganathan et al. present various traditional replication and caching strategies and evaluate them from the perspective of the data grid in [5]. They measure the access latency and bandwidth consumption of each strategy with a simulation tool, and their results show that Cascading and Fast Spread perform best among the traditional strategies. Economy-based replication is proposed in [3, 4]. In the economic approach, a kind of auction protocol is used to select the best replica for a job and to trigger long-term (dynamic) optimization by using file access patterns. The authors show the improvement over traditional replication techniques by performing simulations with OptorSim, a data grid simulation tool developed as part of the European Data Grid Project [6]. General data grid scenarios are modeled in OptorSim, and one can evaluate various replication strategies implemented in it. The existing replication techniques mentioned above are based on the file access pattern at each site: if a grid site requests some files more frequently than others, it is better for the site to hold those files for near-future use. Even though this site-level locality can reduce data access time to some extent, a limitation remains. The performance gain from site-level locality materializes only when grid sites have enough space to store a large portion of the data and predictable file access patterns emerge.
However, in many cases we cannot be sure that a single grid site will have enough space to store a large portion of the whole data, or that predictable file access patterns will emerge. We find another key to performance improvement by broadening our view of locality to the network level.
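As a point of comparison for the site-level replacement described above, the behavior when an SE fills up can be sketched as a simple least-frequently-used (LFU) eviction policy over local access counts. The class, capacity, and file names are illustrative assumptions; this is a baseline, not the BHR algorithm:

```python
# Minimal sketch of site-level replacement: when the storage element (SE)
# is full, evict the locally least-frequently-accessed replica. Capacity is
# modeled as a replica count for simplicity.

class StorageElement:
    def __init__(self, capacity: int):
        self.capacity = capacity          # max number of replicas held
        self.access_count = {}            # file name -> local access frequency

    def access(self, filename: str) -> None:
        if filename in self.access_count:
            self.access_count[filename] += 1
            return
        if len(self.access_count) >= self.capacity:
            # Evict the replica with the lowest local access frequency.
            victim = min(self.access_count, key=self.access_count.get)
            del self.access_count[victim]
        self.access_count[filename] = 1

se = StorageElement(capacity=2)
for f in ["a", "a", "b", "c"]:            # "c" forces eviction of cold file "b"
    se.access(f)
print(sorted(se.access_count))            # -> ['a', 'c']
```

The limitation noted in the text is visible here: eviction decisions use only this site's counters, so a file that is popular elsewhere in the region can still be discarded.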

3.     Dynamic Replication Strategy based on Bandwidth Hierarchy: In this section, we propose a novel dynamic replication strategy, called BHR, which is based on the bandwidth hierarchy of the Internet. Our BHR strategy benefits from the network-level locality of files. The proposed strategy is motivated by the assumption that a hierarchy of bandwidth appears in the Internet. Figure 1 illustrates this assumption.
 
We assume that groups of sites are located in the same network region. A network region is a network topological space in which sites are located close together; it can be seen as the Internet topology within a country. It is generally known that lower bandwidth is allocated to network links between sites in different countries than to links between sites within a country. In many cases, a network region corresponds to a geographical space such as a country or a continent. If the required replica is in the same region, the job can fetch it quickly, since broader bandwidth is available within a region. In contrast, if the required replica is located at a site in another region, much more time is consumed fetching it over many links, including highly congested ones. Thus a form of locality emerges, which we call network-level locality. The main purpose of the BHR strategy is to maximize this network-level locality within the job execution model of the data grid. BHR tries to replicate files that are likely to be used frequently within the region in the near future. The BHR optimizer runs both at the region level and at the site level, with the two cooperating with each other. Figure 2 describes the BHR algorithm in detail.

BHR Optimizer (region):
    Keep track of the names of stored files and the access frequency
    of each file within the region;

BHR Optimizer (site):
1.  if (needed replica not in site)
        Fetch replica from other site;
2.  Proceed to execute the job with the replica;
3.  if (free space in SE to store new replica)
        Store it;
    else {
4.      if (new replica is duplicated at other sites within region)
            Terminate optimizer;    // avoid duplication
        else {
            Sort files in SE from least frequently accessed;
            for (each file in sorted list) {
                if (file is duplicated at other sites within region)
                    delete it;
                if (enough free space to store new replica)
                    break;
            }
        }
5.      if (!enough free space) {
            Sort files in SE from least frequently accessed;
            // based on the access history gathered by the region optimizer
            for (each file in sorted list) {
                if (access frequency of new replica > access frequency of the file)
                    delete file;
                if (enough free space)
                    break;
            }
        }
        if (enough free space)
            Store new replica;
    }

Figure 2. BHR replication algorithm

The access frequency gathered by the region optimizer is the number of file requests made by jobs run at sites within a region; it reflects the regional popularity of files. If a job fetches a file from another site and the SE is already filled with replicas, we must determine whether storing the newly received file is beneficial. If it turns out to be profitable, we choose a file to delete in order to store the new replica. We apply a two-step decision process. The first step avoids duplication: procedure 4 in Figure 2 places as many distinct replicas as possible in the region without duplication. Second, we take account of the popularity of files, as represented by procedure 5. In a data grid, file accesses exhibit popularity: certain files are requested by grid jobs more frequently than others. While previous strategies consider the popularity of files at the site level, we focus on access popularity at the region level, so BHR replaces files that are unpopular from the regional point of view. By applying these two steps, the chance of hitting network-level locality is maximized.
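As a concrete illustration, the two-step replacement decision of Figure 2 might be sketched as follows. The data structures are our own assumptions: the SE is modeled as a set of file names with a fixed slot count, `region_freq` holds region-wide access frequencies, and `region_copies` holds how many other sites in the region already hold each file.

```python
# Sketch of BHR's two-step replacement decision (procedures 4 and 5 of
# Figure 2), under the simplifying assumption that every replica occupies
# one storage slot.

def bhr_store(se: set, capacity: int, new_file: str,
              region_freq: dict, region_copies: dict) -> bool:
    """Try to store new_file in the SE; return True if it was stored."""
    if new_file in se:
        return True
    if len(se) < capacity:                      # procedure 3: free space
        se.add(new_file)
        return True
    if region_copies.get(new_file, 0) > 0:      # procedure 4: already held
        return False                            # in region, avoid duplication
    # Step 1: delete files duplicated elsewhere in the region,
    # least regionally popular first.
    for f in sorted(se, key=lambda f: region_freq.get(f, 0)):
        if region_copies.get(f, 0) > 1:
            se.discard(f)
        if len(se) < capacity:
            break
    # Step 2 (procedure 5): replace regionally less popular files.
    if len(se) >= capacity:
        for f in sorted(se, key=lambda f: region_freq.get(f, 0)):
            if region_freq.get(new_file, 0) > region_freq.get(f, 0):
                se.discard(f)
            if len(se) < capacity:
                break
    if len(se) < capacity:
        se.add(new_file)
        return True
    return False
```

For example, with `se = {"a", "b"}`, a full SE of capacity 2, and file "b" duplicated at another site in the region, storing a new file "x" evicts "b" in step 1 and keeps the regionally popular "a"; if "x" were itself already held elsewhere in the region, procedure 4 would decline to store it at all.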

4.     Objective:
1)    To study replica management in grid computing environments.
2)    To find solutions for the problem of replication.
3)    To find proper algorithms and protocols for managing replicas in a data grid.

5.     Conclusion and Future Works: In this paper, we propose a novel dynamic replica optimization strategy based on network-level locality. BHR tries to replicate popular files as widely as possible within a region, where broad bandwidth is provided between sites. The simulation results show that BHR yields shorter job execution times than other strategies, especially when grid sites have relatively small storage and the hierarchy of bandwidth appears clearly. BHR extends current site-level replica optimization in a more scalable way by exploiting network-level locality. In future work, we plan to survey the actual Internet topology for data grids and study how to group grid sites into regions. We will also collect experimental data, such as data access patterns, from real data grid applications and apply them to our BHR strategy to verify its performance in practical settings.


   
6. References:
1.    I. Foster, C. Kesselman, and S. Tuecke (2001): The Anatomy of the Grid: Enabling Scalable Virtual Organizations. International Journal of Supercomputer Applications, 15(3).
2.    Wolfgang Hoschek, Javier Jaen-Martinez, Asad Samar, Heinz Stockinger, and Kurt Stockinger (2000): Data Management in an International Data Grid Project. 1st IEEE/ACM International Workshop on Grid Computing (Grid'2000), Bangalore, India, December.
3.    William H. Bell, David G. Cameron, Ruben Carvajal-Schiaffino, A. Paul Millar, Kurt Stockinger, and Floriano Zini (2003): Evaluation of an Economy-Based File Replication Strategy for a Data Grid. International Workshop on Agent-based Cluster and Grid Computing at CCGrid 2003, Tokyo, Japan, May. IEEE Computer Society Press.
4.    Mark Carman, Floriano Zini, Luciano Serafini, and Kurt Stockinger (2002): Towards an Economy-Based Optimisation of File Access and Replication on a Data Grid. International Workshop on Agent-based Cluster and Grid Computing at the International Symposium on Cluster Computing and the Grid (CCGrid'2002), Berlin, Germany, May. IEEE Computer Society Press.
5.    Kavitha Ranganathan and Ian Foster (2001): Identifying Dynamic Replication Strategies for a High Performance Data Grid. International Workshop on Grid Computing, Denver, November.
6.    OptorSim – A Replica Optimizer Simulation: http://edg-wp2.web.cern.ch/edg-wp2/optimization/optorsim.html