CN102819664A - Influence maximization parallel accelerating method based on graphic processing unit - Google Patents

Publication number
CN102819664A
Authority
CN
China
Prior art keywords
node
gpu
nodes
visited
influence value
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN2012102487323A
Other languages
Chinese (zh)
Other versions
CN102819664B (en)
Inventor
李姗姗
廖湘科
刘晓东
吴庆波
戴华东
彭绍亮
王蕾
付松龄
鲁晓佩
郑思
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
National University of Defense Technology
Original Assignee
National University of Defense Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by National University of Defense Technology
Priority to CN201210248732.3A
Publication of CN102819664A
Application granted
Publication of CN102819664B
Legal status: Expired - Fee Related
Anticipated expiration

Abstract

The invention discloses a parallel acceleration method for influence maximization based on the graphics processing unit (GPU). Its purpose is to use the parallel computing capability of the GPU to accelerate the algorithm and shorten its execution time. The method is characterized in that, in each Monte Carlo simulation, the strongly connected components of the network graph are first found and all nodes in the same strongly connected component are merged into a single node whose weight is the sum of the weights of the component's nodes; the influence value of each node is then computed in parallel using a bottom-up traversal strategy, with the GPU's computing cores using different threads to compute the influence values of different nodes in parallel, and the K most influential nodes are obtained. Because the graph is converted into a directed acyclic graph, the amount of influence computation is significantly reduced; at the same time, scheduling as many per-node computations as possible in parallel on the GPU's computing cores shortens the overall running time.

Description

A GPU-based parallel acceleration method for influence maximization
Technical field
The present invention relates to methods for solving the influence maximization problem in social networks within the field of massive data mining, and in particular proposes a parallel acceleration method based on the graphics processing unit (GPU) for mining the massive user base of large-scale social networks.
Background art
The rapid development of Web 2.0 technology has driven the boom in social media. Social networking sites continue to emerge, and the user bases of sites such as Facebook and Twitter abroad and Renren and Sina Weibo in China are growing very quickly; Facebook's active users now exceed 850 million. Social networking sites are not only a bridge for communication between people but have also become an important medium for spreading and diffusing information. Research shows that 68% of customers consult family and friends before buying a product. Viral marketing exploits exactly this word-of-mouth effect between users to carry out online marketing such as brand promotion. With the sustained rapid growth of social network users, viral marketing has become a highly efficient way of spreading information.
Influence maximization is a classical problem concerning influence propagation in social network analysis. Consider the following scenario: a company wants to promote a new product. Its strategy is to select K customers to try the product for free and then rely on those K customers' promotion of the product, and the resulting spread of influence, to attract more customers to buy it, thereby maximizing profit. The influence maximization problem can be formalized as follows. For a social network graph G = (V, E, W), V = {v_0, v_1, ..., v_{n-1}} is the node set and n is the number of nodes in V; E is the set of directed edges between nodes in V, i.e. $E \subseteq V \times V$, and m is the number of directed edges in E; W is the set of node weights in G, characterizing the influence of each node (the initial value is 1, meaning a node can influence only itself). Given the network graph G and the number K of nodes in the initial active node set, the influence maximization problem is to select the K best nodes from V as the initial active node set S so that, through influence propagation, the final spread of influence is maximized. The core of the problem is how to locate the K most influential members of the network, i.e. its opinion leaders, so that viral marketing ultimately reaches the largest number of users. Research on influence maximization is of great practical significance not only for marketing but also for applications such as public-opinion early warning and epidemic detection. Since Pedro Domingos and Matt Richardson posed the influence maximization problem in the paper "Mining the Network Value of Customers", published at ACM SIGKDD 2001, it has attracted growing attention from researchers. In the paper "Maximizing the Spread of Influence through a Social Network", published at ACM SIGKDD 2003, David Kempe et al. proved that influence maximization is NP-hard and proposed a hill-climbing greedy algorithm that obtains an approximately optimal solution. Although the hill-climbing greedy algorithm achieves the best approximation ratio of 1 - 1/e (where e is the base of the natural logarithm), Kempe et al. compute each node's influence value with repeated Monte Carlo simulations (for example 20000 runs), which consumes a great deal of time and does not scale to large networks.
Many researchers have worked on new methods to improve the efficiency of influence maximization. The key cost in the hill-climbing greedy algorithm is the repeated Monte Carlo simulation used to compute the influence values of all nodes. To address this, Jure Leskovec et al., in the paper "Cost-effective Outbreak Detection in Networks" published at ACM SIGKDD 2007, designed a new optimization, CELF, based on the submodularity of the influence spread function; it greatly reduces the amount of Monte Carlo simulation and hence the computing time. Later, Wei Chen et al. published "Efficient Influence Maximization in Social Networks" at ACM SIGKDD 2009, proposing MixGreedy, the best greedy algorithm to date. Its improvement is to compute the influence values of all nodes in the network within each Monte Carlo simulation, further reducing the algorithm's complexity. MixGreedy also integrates CELF, which greatly reduces execution time. Nevertheless, because influence maximization is computationally very expensive, even MixGreedy remains very time-consuming on large-scale social networks; for example, selecting the 50 most influential users from a social network of 37154 nodes takes more than 2 hours. How to quickly mine the most influential users from the massive user base of a large-scale social network has therefore become an urgent problem.
On the other hand, the many-core, multithreaded, high-bandwidth architecture of the GPU (Graphics Processing Unit) gives it very strong parallel computing capability, and GPUs are widely used for general-purpose computing. Many graph algorithms, such as breadth-first search and minimum spanning tree, can be accelerated using the GPU's parallelism. Making full use of the GPU's parallel computing capability, exploiting the potential for concurrent execution in influence maximization, and designing a parallel acceleration method based on the GPU architecture is a feasible way to solve influence maximization in large-scale social networks.
In summary, the efficiency of influence maximization is a widely studied problem in social network analysis. Existing methods cannot accurately locate the most influential users within a reasonable time, scale poorly, and are not applicable to large-scale social networks. An efficient and scalable solution to influence maximization is therefore a technical problem of great interest to those skilled in the art. No published literature on influence maximization describes a method that uses the GPU's parallel computing capability to reduce running time.
Summary of the invention
The technical problem to be solved by the present invention is: for the influence maximization problem in social networks, to propose a new GPU-based parallel influence maximization method that fully exploits the parallelizable parts of the greedy algorithm and uses the GPU's parallel computing capability to accelerate the algorithm and reduce its execution time.
To solve this technical problem, the technical scheme of the present invention is as follows. In each Monte Carlo simulation, first find the strongly connected components of the network graph. Because all nodes in the same strongly connected component have the same influence value, merge all nodes of a component into a single node whose weight is the sum of the weights of the component's nodes. Then compute the influence value of each node in parallel using a bottom-up traversal strategy: using the GPU's parallel computing capability, each GPU computing core uses its own thread to compute the influence value of a different node. Scheduling as many per-node computations as possible in parallel on the GPU's computing cores reduces the overall running time.
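The claim that all nodes of one strongly connected component share the same influence value can be checked on a toy graph. The sketch below is illustrative only and is not part of the patent text: it assumes unit node weights and takes a node's influence value in a sampled graph to be the number of nodes reachable from it (itself included). The helper name `reachable` and the example graph are hypothetical.

```python
def reachable(adj, start):
    """Return the set of nodes reachable from start (including start) by DFS."""
    seen = {start}
    stack = [start]
    while stack:
        u = stack.pop()
        for w in adj[u]:
            if w not in seen:
                seen.add(w)
                stack.append(w)
    return seen

# Nodes 0, 1, 2 form a cycle (one strongly connected component); node 3 hangs off node 2.
adj = {0: [1], 1: [2], 2: [0, 3], 3: []}

# Assumption: unit weights, so influence = number of reachable nodes.
influence = {v: len(reachable(adj, v)) for v in adj}
```

Every node on the cycle reaches exactly the same set of nodes, so the three values coincide; this is why the component can be collapsed into one node carrying the summed weight.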
The specific technical scheme is:
Step 1: Initialize the influence maximization node set S to empty.
Step 2: Set the current Monte Carlo simulation count Num = 0.
Step 3: Select edges of the graph using the Monte Carlo simulation method from the paper "Efficient Influence Maximization in Social Networks" published by Wei Chen et al. at ACM SIGKDD 2009, obtaining graph G'.
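As a rough illustration of the edge selection in Step 3, the sketch below keeps each directed edge independently with a fixed probability. This is an assumption: the patent only refers to the method of Wei Chen et al. and does not state a propagation probability, so the parameter `p`, the function name `sample_live_edges`, and the example edge list are all hypothetical.

```python
import random

def sample_live_edges(edges, p, rng):
    """Keep each directed edge (u, v) independently with probability p."""
    return [(u, v) for (u, v) in edges if rng.random() < p]

edges = [(0, 1), (1, 2), (2, 0), (2, 3)]
rng = random.Random(42)          # fixed seed so one simulation is reproducible

# One Monte Carlo sample of the graph; p = 0.5 is an arbitrary illustrative value.
g_prime = sample_live_edges(edges, 0.5, rng)
```

Each of the R simulations would draw a fresh sample like `g_prime` and run the remaining steps on it.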
Step 4: Find the strongly connected components of G'. In a directed graph, two nodes v_e and v_f are strongly connected if there is both a directed path from v_e to v_f and a directed path from v_f to v_e. If every pair of nodes in a directed graph is strongly connected, the graph is a strongly connected graph. Using the depth-first-search-based Tarjan algorithm proposed by Robert Tarjan in the paper "Depth-first search and linear graph algorithms", published in the SIAM Journal on Computing in 1972, find all strongly connected components SCC_i of G', where i ranges from 0 to j-1 and j is the number of strongly connected components in G'.
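A minimal recursive sketch of Tarjan's algorithm follows, assuming nodes are numbered 0..n-1 and the graph is given as adjacency lists. The function name `tarjan_scc` and the example graph are illustrative; the patent does not prescribe a particular implementation, and for very large graphs an iterative version would be needed to stay within Python's recursion limit.

```python
def tarjan_scc(n, adj):
    """Return the strongly connected components of a digraph with nodes 0..n-1.

    adj[u] is the list of successors of u.
    """
    index = [None] * n        # DFS discovery order of each node
    low = [0] * n             # smallest index reachable from the node's DFS subtree
    on_stack = [False] * n
    stack = []
    sccs = []
    counter = 0

    def visit(u):
        nonlocal counter
        index[u] = low[u] = counter
        counter += 1
        stack.append(u)
        on_stack[u] = True
        for w in adj[u]:
            if index[w] is None:          # tree edge: recurse first
                visit(w)
                low[u] = min(low[u], low[w])
            elif on_stack[w]:             # edge back into the current component
                low[u] = min(low[u], index[w])
        if low[u] == index[u]:            # u is the root of a component
            comp = []
            while True:
                w = stack.pop()
                on_stack[w] = False
                comp.append(w)
                if w == u:
                    break
            sccs.append(comp)

    for u in range(n):
        if index[u] is None:
            visit(u)
    return sccs

adj = [[1], [2], [0, 3], []]
components = tarjan_scc(4, adj)
```

On this example the cycle 0, 1, 2 forms one component and node 3 forms another, so j = 2.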
Step 5: Based on the strongly connected components SCC_i of G', convert G' into a directed acyclic graph G*, as follows:
5.1: Initialize i = 0.
5.2: Replace the strongly connected component SCC_i with a new node v_{n+i}, where n is the number of nodes in G'. Specifically:
5.2.1: For SCC_i, add a new node v_{n+i}. The in-edge set of v_{n+i} is the union of the in-edge sets of all nodes in SCC_i, its out-edge set is the union of the out-edge sets of all nodes in SCC_i, and its weight is the sum of the weights of the nodes in SCC_i.
5.2.2: Set the in-edge set and out-edge set of every node in SCC_i to empty and its weight to zero, as follows:
5.2.2.1: Initialize the integer variable l to 0.
5.2.2.2: For node v_l in SCC_i, set the in-edge set and out-edge set of v_l to the empty set $\emptyset$ and set its weight to 0.
5.2.2.3: l = l + 1. If l < n_i, where n_i is the number of nodes in SCC_i, go to 5.2.2.2. If l ≥ n_i, go to 5.3.
5.3: i = i + 1. If i < j, go to 5.2. If i ≥ j, all strongly connected components have been replaced by new nodes and G' has been converted into the directed acyclic graph G*; go to Step 6.
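The conversion in Step 5 can be sketched as a graph condensation. The version below is a simplification and not the patent's exact bookkeeping: instead of appending new nodes v_{n+i} and emptying the old ones, it numbers the components 0..j-1 and builds a fresh graph, and it drops edges internal to a component so that the result is acyclic. The function name `condense` and the example inputs are hypothetical.

```python
def condense(adj, weight, sccs):
    """Collapse each strongly connected component into one node.

    adj[u] lists the successors of u, weight[u] is the weight of u, and
    sccs is a list of components (each a list of node ids).
    Returns (dag_adj, dag_weight) indexed by component number.
    """
    comp_of = {}
    for i, comp in enumerate(sccs):
        for v in comp:
            comp_of[v] = i

    dag = [set() for _ in sccs]
    dag_weight = [0] * len(sccs)
    for i, comp in enumerate(sccs):
        for v in comp:
            dag_weight[i] += weight[v]        # merged weight = sum of member weights
            for w in adj[v]:
                if comp_of[w] != i:           # skip edges inside the component
                    dag[i].add(comp_of[w])
    return [sorted(s) for s in dag], dag_weight

adj = [[1], [2], [0, 3], []]
weight = [1, 1, 1, 1]
sccs = [[3], [2, 1, 0]]                       # e.g. the output of an SCC search
dag_adj, dag_weight = condense(adj, weight, sccs)
```

Here component 1 (the cycle) has weight 3 and a single edge to component 0 (the former node 3).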
Step 6: Starting from the nodes with out-degree 0, traverse all nodes of the directed acyclic graph G* bottom-up and use the GPU to compute the influence values of all nodes. Specifically:
6.1: Define and initialize variables:
6.1.1: Use the Boolean array Visited[] to record whether each node has been visited. Visited[v_p] = true means node v_p has been visited and Visited[v_p] = false means it has not, where 0 ≤ p ≤ n-1. Initialize all of Visited[] to false, meaning no node has been visited.
6.1.2: Use the integer array Count[] to record how many child nodes of each node have been visited, where 0 ≤ Count[v_x] ≤ outdegree[v_x], 0 ≤ x ≤ n-1, and outdegree[v_x] is the out-degree of node v_x. Initialize all of Count[] to 0, meaning no child has been visited.
6.1.3: Use the integer array Inf[] to record the influence value of each node, where 0 ≤ Inf[v_x] ≤ n, 0 ≤ x ≤ n-1. Initialize all of Inf[] to 0.
6.1.4: Use the string array Label[] to record the label of each node. The label Label[v_x] marks the positions where node v_x may overlap with other nodes; nodes v_a and v_b overlap at node v_c if and only if at least one path leads from each of v_a and v_b to v_c, 0 ≤ a, b, c ≤ n-1. Initialize all of Label[] to NULL.
6.1.5: Use the Boolean variable Stop to record whether the threads have finished computing. Stop = true means the influence values of all nodes have been computed in this simulation, and Stop = false means they have not. Stop is a global variable that every GPU thread can modify. Initialize Stop to false.
6.2: If the stop flag Stop is false, the influence computation for this simulation is not yet finished; go to 6.3 and compute with multiple GPU threads in parallel. If Stop is true, the influence values of all nodes in this simulation have been computed; go to Step 7.
6.3: The GPU computes node influence values in a multithreaded parallel manner using single-instruction multiple-data execution. Multithreaded parallel here means that the GPU assigns one thread to each node to compute its influence value and computes the influence values of y nodes at a time with y parallel threads, where y is the number of stream processors in the GPU (this number varies with the GPU model). When the influence values of the current y nodes have been computed, if any node's influence value remains uncomputed, the GPU's thread scheduling continues computing the remaining nodes in the same parallel manner until all nodes are done. Under single-instruction multiple-data execution, all stream processors in the same stream processor unit share one instruction fetch unit, instructions are issued in order, and there is no branch prediction; that is, different threads execute the same instruction on different data, which achieves parallel computation. A GPU thread computes the influence value of node v_p as follows:
6.3.1: Set the stop flag Stop to true.
6.3.2: If Visited[v_p] is false, go to 6.3.3; otherwise node v_p has already been visited, go to 6.2.
6.3.3: If Count[v_p] equals the out-degree of v_p, all child nodes of v_p have been visited; go to 6.3.4 to compute the influence value of v_p. Otherwise some child of v_p has not yet been processed; set the stop flag Stop to false and go to 6.2.
6.3.4: Compute sum, the total of the influence values of all child nodes of v_p: $sum = \sum_{v_q \in out[v_p]} Inf[v_q]$, where out[v_p] is the set of all child nodes of v_p.
6.3.5: Compute the label Label(v_p) of node v_p. Label(v_p) is the union of the contributions of all child nodes of v_p to v_p, i.e. $Label(v_p) = \bigcup_{v_q \in out[v_p]} Con(v_q)$, where Con(v_q) is the contribution of child node v_q to v_p, defined as follows: if the in-degree of v_q is greater than 1, an overlap may occur at v_q, and the contribution of v_q to v_p is v_q itself, i.e. Con(v_q) = v_q; if the in-degree of v_q is less than or equal to 1, no overlap can occur at v_q, and the contribution of v_q is its label, i.e. Con(v_q) = Label(v_q).
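6.3.6: Compute the overlapping influence value Overlap(out[v_p]) of the set out[v_p] of all child nodes of v_p, as follows:
6.3.6.1: Initialize Overlap(out[v_p]) to 0 and initialize the overlap range set Range to out[v_p].
6.3.6.2: For any node v_a ∈ Range, if there is a node v_b ∈ Range with v_b ≠ v_a that is reachable by a path from v_a, the overlap occurs at v_b, so Overlap(out[v_p]) = Overlap(out[v_p]) + Inf[v_b]; at the same time remove v_b from Range, i.e. Range = Range - v_b.
6.3.6.3: Use the string array Extra[] to record the duplicated items in Range, initialized to the empty set $\emptyset$. Use the string array Filter[] to record the single items remaining in Range after duplicates are removed, initialized to the empty set $\emptyset$. For any node v_a ∈ Range and any element u ∈ Label(v_a): if u already belongs to Filter, u is a duplicated item, so add u to Extra and add its influence value Inf[u] to Overlap(out[v_p]), i.e. Extra = Extra ∪ u, Overlap(out[v_p]) = Overlap(out[v_p]) + Inf[u]; if u does not belong to Filter, add u to Filter, i.e. Filter = Filter ∪ u.
6.3.6.4: Because the elements of Extra[] and Filter[] may still contain repetitions, the final overlapping influence value of all child nodes of v_p must add the difference of their overlap values, i.e. Overlap(out[v_p]) = Overlap(out[v_p]) + (Overlap(Filter) - Overlap(Extra)).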
6.3.7: Compute the influence value Inf[v_p] of node v_p: Inf[v_p] = sum + weight(v_p) - Overlap(out[v_p]), where weight(v_p) is the weight of v_p. Then TotalInf[v_p] = TotalInf[v_p] + Inf[v_p], where TotalInf[v_p] is the total influence value of node v_p over R Monte Carlo simulations. R is the total number of simulations and is generally set to 20000.
6.3.8: If node v_p has no parent node, go to 6.3.9. Otherwise, for every parent node v_s of v_p, increment its visited-child count, i.e. Count[v_s] = Count[v_s] + 1, and set the stop flag Stop to false.
6.3.9: Mark node v_p as visited, i.e. Visited[v_p] = true, and go to 6.2.
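The control flow of Step 6 (the Visited, Count and Stop variables and the repeated rounds of 6.2 and 6.3) can be emulated sequentially. The sketch below is a hedged illustration, not the patent's GPU kernel: the inner loop over p stands in for one GPU thread per node, Count updates are deferred to the end of each round to mimic threads running concurrently, and the overlap correction is obtained from explicit reachable sets rather than the Label, Extra and Filter bookkeeping of 6.3.5 and 6.3.6. The function name `bottom_up_influence` is hypothetical and the input is assumed to be acyclic.

```python
def bottom_up_influence(dag_adj, weight):
    """Compute the influence value of every node of a DAG, bottom-up in rounds.

    dag_adj[p] lists the children of p; weight[p] is the weight of p.
    """
    n = len(dag_adj)
    parents = [[] for _ in range(n)]
    for u in range(n):
        for c in dag_adj[u]:
            parents[c].append(u)

    visited = [False] * n                 # step 6.1.1
    count = [0] * n                       # step 6.1.2
    inf = [0] * n                         # step 6.1.3
    reach = [set() for _ in range(n)]     # stand-in for Label/Overlap (assumption)

    stop = False                          # step 6.1.5
    while not stop:                       # step 6.2
        stop = True                       # step 6.3.1
        pending = []                      # Count increments applied after the round
        for p in range(n):                # one simulated "thread" per node
            if visited[p]:                # step 6.3.2
                continue
            if count[p] != len(dag_adj[p]):   # step 6.3.3: a child is unfinished
                stop = False
                continue
            reach[p] = {p}
            for c in dag_adj[p]:
                reach[p] |= reach[c]
            # Equals sum + weight(p) - Overlap(out[p]) from step 6.3.7.
            inf[p] = sum(weight[v] for v in reach[p])
            for s in parents[p]:          # step 6.3.8
                pending.append(s)
                stop = False
            visited[p] = True             # step 6.3.9
        for s in pending:
            count[s] += 1
    return inf

# Diamond DAG: 0 -> 1, 0 -> 2, 1 -> 3, 2 -> 3, all weights 1.
diamond = [[1, 2], [3], [3], []]
inf = bottom_up_influence(diamond, [1, 1, 1, 1])
```

In the diamond, the children of node 0 have influence 2 each, so the plain sum would give 2 + 2 + 1 = 5; node 3 is counted twice, and subtracting that overlap of 1 yields the correct value 4. The set-based version trades memory for simplicity, which is presumably what the label scheme in the text is designed to avoid.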
Step 7: Increment the Monte Carlo simulation count Num by 1. If Num < R, go to Step 3; otherwise go to Step 8.
Step 8: Among all nodes in the set V - S, select the node v with the largest TotalInf[] and add it to S.
Step 9: If the number of nodes in S satisfies |S| < K, go to Step 2; otherwise the K most influential nodes have been selected and the method ends.
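The outer loop formed by Steps 1, 2 and 7 to 9 can be outlined as follows. This is a sketch under stated assumptions: `simulate_once` is a hypothetical callback standing in for Steps 3 to 6 (one sampled graph and the per-node influence values computed on it), and TotalInf is reset at the start of each selection round, which the text does not state explicitly.

```python
def greedy_select(nodes, k, r, simulate_once):
    """Pick k seed nodes; each round runs r simulations and takes the best node."""
    selected = []                                     # Step 1: S is empty
    for _ in range(k):                                # Step 9: repeat until |S| = K
        total_inf = {v: 0 for v in nodes}             # assumption: reset per round
        for _ in range(r):                            # Steps 2 and 7: Num = 0 .. R-1
            inf = simulate_once(selected)             # Steps 3 to 6 (callback)
            for v in nodes:
                total_inf[v] += inf[v]
        candidates = [v for v in nodes if v not in selected]   # the set V - S
        best = max(candidates, key=lambda v: total_inf[v])     # Step 8
        selected.append(best)
    return selected

def toy_simulation(selected):
    """Hypothetical stand-in returning fixed influence values for four nodes."""
    return {0: 4, 1: 2, 2: 2, 3: 1}

seeds = greedy_select([0, 1, 2, 3], 2, 5, toy_simulation)
```

With the fixed toy values, node 0 is chosen first; nodes 1 and 2 then tie, and `max` keeps the first of them.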
Compared with the prior art, the present invention achieves the following beneficial effects:
1. Step 4 of the invention computes the strongly connected components of the graph. Because all nodes in a strongly connected component have the same influence value, Step 5 replaces each component with a single node, converting the graph into a directed acyclic graph and significantly reducing the amount of influence computation.
2. Step 6 computes the influence value of each node by bottom-up traversal. Because the influence value of a parent node depends directly on the influence values of all its child nodes, this method obtains the influence values of all nodes with a single traversal of the whole graph, reducing complexity.
3. The invention fully exploits the parallelism of the original greedy algorithm and the parallel computing capability of the GPU, in particular by assigning one GPU thread to each node. Parallel execution across GPU threads greatly reduces the program's execution time, so larger social networks can be handled and the method scales well.
Description of drawings
Fig. 1 is a flowchart of the best existing greedy algorithm, MixGreedy;
Fig. 2 is the overall flowchart of the present invention.
Embodiment
Fig. 1 is a flowchart of the best existing greedy algorithm, MixGreedy.
Step 1: Initialize the node set S to empty.
Step 2: Set the current Monte Carlo simulation count Num = 0.
Step 3: Select edges of the graph using the Monte Carlo simulation method, obtaining graph G'.
Step 4: Perform a breadth-first search from each node to compute the influence value of each node.
Step 5: Increment the Monte Carlo simulation count Num by 1. If Num < R, go to Step 3; otherwise go to Step 6.
Step 6: Select the node v with the largest TotalInf[] influence value in the set V - S and add it to S.
Step 7: If the number of nodes in S satisfies |S| < K, go to Step 2; otherwise the K most influential nodes have been selected and the program exits.
Fig. 2 is the overall flowchart of the present invention.
Step 1: Initialize the node set S to empty.
Step 2: Set the current Monte Carlo simulation count Num = 0.
Step 3: Select edges of the graph using the Monte Carlo simulation method, obtaining graph G'.
Step 4: Find the strongly connected components of G'.
Step 5: Convert G' into a directed acyclic graph G* based on its strongly connected components.
Step 6: Starting from the nodes with out-degree 0, traverse all nodes of the directed acyclic graph G* bottom-up, using different GPU threads to compute the influence values of all nodes in parallel.
Step 7: Increment the Monte Carlo simulation count Num by 1. If Num < R, go to Step 3; otherwise go to Step 8.
Step 8: Select the node v with the largest TotalInf[] influence value in the set V - S and add it to S.
Step 9: If the number of nodes in S satisfies |S| < K, go to Step 2; otherwise the K most influential nodes have been selected and the program exits.

Claims (3)

1. A GPU-based parallel acceleration method for influence maximization, comprising the following steps:
Step 1: Initialize the influence maximization node set S to empty;
Step 2: Set the current Monte Carlo simulation count Num = 0;
Step 3: Select edges of the graph using the Monte Carlo simulation method, obtaining graph G';
characterized by further comprising the following steps:
Step 4: Using the depth-first-search-based Tarjan algorithm, find all strongly connected components SCC_i of G', where i ranges from 0 to j-1 and j is the number of strongly connected components in G';
Step 5: Based on the strongly connected components SCC_i of G', convert G' into a directed acyclic graph G*, as follows:
5.1: Initialize i = 0;
5.2: Replace the strongly connected component SCC_i with a new node v_{n+i}, where n is the number of nodes in G';
5.3: i = i + 1; if i < j, go to 5.2; if i ≥ j, go to Step 6;
Step 6: Starting from the nodes with out-degree 0, traverse all nodes of the directed acyclic graph G* bottom-up and use different GPU threads to compute the influence values of all nodes, the thread numbered p being responsible for computing the influence value of node v_p, where 0 ≤ p ≤ n-1, as follows:
6.1: Define and initialize variables, as follows:
6.1.1: Use the Boolean array Visited[] to record whether each node has been visited; Visited[v_p] = true means node v_p has been visited and Visited[v_p] = false means it has not; initialize all of Visited[] to false, meaning no node has been visited;
6.1.2: Use the integer array Count[] to record how many child nodes of each node have been visited, where 0 ≤ Count[v_x] ≤ outdegree[v_x], 0 ≤ x ≤ n-1, and outdegree[v_x] is the out-degree of node v_x; initialize all of Count[] to 0, meaning no child has been visited;
6.1.3: Use the integer array Inf[] to record the influence value of each node, where 0 ≤ Inf[v_x] ≤ n, 0 ≤ x ≤ n-1; initialize all of Inf[] to 0;
6.1.4: Use the string array Label[] to record the label of each node; the label Label[v_x] marks the positions where node v_x may overlap with other nodes, nodes v_a and v_b overlapping at node v_c if and only if at least one path leads from each of v_a and v_b to v_c, 0 ≤ a, b, c ≤ n-1; initialize all of Label[] to NULL;
6.1.5: Use the Boolean variable Stop to record whether the threads have finished computing; Stop = true means the influence values of all nodes have been computed in this simulation and Stop = false means they have not; Stop is a global variable that every GPU thread can modify; initialize Stop to false;
6.2: If the stop flag Stop is false, go to 6.3; if Stop is true, go to Step 7;
6.3: The GPU computes node influence values in a multithreaded parallel manner using single-instruction multiple-data execution; multithreaded parallel means that the GPU assigns one thread to each node to compute its influence value and computes the influence values of y nodes at a time with y parallel threads, y being the number of stream processors in the GPU; when the influence values of the current y nodes have been computed, if any node's influence value remains uncomputed, the GPU's thread scheduling continues computing the remaining nodes in the same parallel manner until the influence values of all nodes have been computed; a GPU thread computes the influence value of node v_p as follows:
6.3.1: Set the stop flag Stop to true;
6.3.2: If Visited[v_p] is false, go to 6.3.3; otherwise node v_p has already been visited, go to 6.2;
6.3.3: If Count[v_p] equals the out-degree of v_p, all child nodes of v_p have been visited; go to 6.3.4 to compute the influence value of v_p; otherwise some child of v_p has not yet been processed; set the stop flag Stop to false and go to 6.2;
6.3.4: Compute sum, the total of the influence values of all child nodes of v_p: $sum = \sum_{v_q \in out[v_p]} Inf[v_q]$, where out[v_p] is the set of all child nodes of v_p;
6.3.5: Compute the label Label(v_p) of node v_p; Label(v_p) is the union of the contributions of all child nodes of v_p to v_p, i.e. $Label(v_p) = \bigcup_{v_q \in out[v_p]} Con(v_q)$, where Con(v_q) is the contribution of child node v_q to v_p, defined as follows: if the in-degree of v_q is greater than 1, the contribution of v_q to v_p is v_q itself, i.e. Con(v_q) = v_q; if the in-degree of v_q is less than or equal to 1, the contribution of v_q is its label, i.e. Con(v_q) = Label(v_q);
6.3.6: Compute the overlapping influence value Overlap(out[v_p]) of the set out[v_p] of all child nodes of v_p, as follows:
6.3.6.1: Initialize Overlap(out[v_p]) to 0 and initialize the overlap range set Range to out[v_p];
6.3.6.2: For any node v_a ∈ Range, if there is a node v_b ∈ Range with v_b ≠ v_a that is reachable by a path from v_a, the overlap occurs at v_b, so Overlap(out[v_p]) = Overlap(out[v_p]) + Inf[v_b]; at the same time remove v_b from Range, i.e. Range = Range - v_b;
6.3.6.3: Use the string array Extra[] to record the duplicated items in Range, initialized to the empty set $\emptyset$; use the string array Filter[] to record the single items remaining in Range after duplicates are removed, initialized to the empty set $\emptyset$; for any node v_a ∈ Range and any element u ∈ Label(v_a): if u already belongs to Filter, u is a duplicated item, so add u to Extra and add its influence value Inf[u] to Overlap(out[v_p]), i.e. Extra = Extra ∪ u, Overlap(out[v_p]) = Overlap(out[v_p]) + Inf[u]; if u does not belong to Filter, add u to Filter, i.e. Filter = Filter ∪ u;
6.3.6.4: Because the elements of Extra[] and Filter[] may still contain repetitions, the final overlapping influence value of all child nodes of v_p must add the difference of their overlap values, i.e. Overlap(out[v_p]) = Overlap(out[v_p]) + (Overlap(Filter) - Overlap(Extra));
6.3.7: Compute the influence value Inf[v_p] of node v_p: Inf[v_p] = sum + weight(v_p) - Overlap(out[v_p]), where weight(v_p) is the weight of v_p; TotalInf[v_p] = TotalInf[v_p] + Inf[v_p], where TotalInf[v_p] is the total influence value of node v_p over R Monte Carlo simulations; R is the total number of simulations and is a positive integer;
6.3.8: If node v_p has no parent node, go to 6.3.9; otherwise, for every parent node v_s of v_p, increment its visited-child count, i.e. Count[v_s] = Count[v_s] + 1, and set the stop flag Stop to false;
6.3.9: Mark node v_p as visited, i.e. Visited[v_p] = true, and go to 6.2;
Step 7: Increment the Monte Carlo simulation count Num by 1; if Num < R, go to Step 3; otherwise go to Step 8;
Step 8: Among all nodes in the set V - S, select the node v with the largest TotalInf[] and add it to S;
Step 9: If the number of nodes in S satisfies |S| < K, go to Step 2; otherwise the K most influential nodes have been selected and the method ends.
2. The influence maximization parallel accelerating method based on GPU as claimed in claim 1, characterized in that, in said 5th step, the method of replacing the strongly connected component SCC_i with a new node v_{N+i} is:
5.2.1: For the strongly connected component SCC_i, add a new node v_{N+i}; the in-edge set of node v_{N+i} is set to the union of the in-edge sets of all nodes in SCC_i, its out-edge set is the union of the out-edge sets of all nodes in SCC_i, and its weight is the sum of the weights of all nodes in this strongly connected component;
5.2.2: Set the in-edge sets and out-edge sets of all nodes in the strongly connected component SCC_i to empty and their weights to zero, as follows:
5.2.2.1: Initialize the integer variable l to 0;
5.2.2.2: For node v_l in the strongly connected component SCC_i, set the in-edge set and the out-edge set of node v_l to the empty set ∅ and set its weight to 0;
5.2.2.3: l = l + 1; if l < n_i, where n_i is the number of nodes in the strongly connected component SCC_i, go to 5.2.2.2; if l ≥ n_i, end.
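Steps 5.2.1 and 5.2.2 can be illustrated with the following Python sketch. The data layout is an assumption made for illustration only: each edge set is stored as a set of neighbouring node IDs in the hypothetical dictionaries `in_edges` and `out_edges`, and weights are stored in `weight`. The sketch follows the wording of the claim literally, so edges between members of the component and references held by nodes outside the component are not rewritten.

```python
from typing import Dict, List, Set


def contract_scc(
    in_edges: Dict[int, Set[int]],
    out_edges: Dict[int, Set[int]],
    weight: Dict[int, float],
    scc: List[int],
    new_node: int,
) -> None:
    """Replace the nodes listed in scc by new_node, modifying the maps in place."""
    # 5.2.1: the new node receives the union of the members' edge sets
    # and the sum of their weights.
    in_edges[new_node] = set().union(*(in_edges[v] for v in scc))
    out_edges[new_node] = set().union(*(out_edges[v] for v in scc))
    weight[new_node] = sum(weight[v] for v in scc)

    # 5.2.2: empty the edge sets of every member and zero its weight
    # (the loop over l = 0 .. n_i - 1 in steps 5.2.2.1 to 5.2.2.3).
    for v in scc:
        in_edges[v] = set()
        out_edges[v] = set()
        weight[v] = 0.0
```

For example, with edges 0→1, 1→0, 1→2 and 3→0 and unit weights, contracting the component {0, 1} into node 4 gives node 4 a weight of 2.0, while nodes 0 and 1 are left with empty edge sets and zero weight.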
3. The influence maximization parallel accelerating method based on GPU as claimed in claim 1, characterized in that said total number of simulations R is 20000.
CN201210248732.3A 2012-07-18 2012-07-18 Influence maximization parallel accelerating method based on graphic processing unit Expired - Fee Related CN102819664B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201210248732.3A CN102819664B (en) 2012-07-18 2012-07-18 Influence maximization parallel accelerating method based on graphic processing unit

Publications (2)

Publication Number Publication Date
CN102819664A CN102819664A (en) 2012-12-12
CN102819664B CN102819664B (en) 2015-02-18

Family

ID=47303774

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201210248732.3A Expired - Fee Related CN102819664B (en) 2012-07-18 2012-07-18 Influence maximization parallel accelerating method based on graphic processing unit

Country Status (1)

Country Link
CN (1) CN102819664B (en)

Patent Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20110264870A1 (en) * 2010-04-23 2011-10-27 Tatu Ylonen Oy Ltd Using region status array to determine write barrier actions
CN101887573A (en) * 2010-06-11 2010-11-17 北京邮电大学 Social network clustering correlation analysis method and system based on core point

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
ALHADI BUSTAMAM et al.: "A GPU implementation of Fast Parallel Markov Clustering in Bioinformatics using ELLPACK-R Sparse Data Format", 《2010 SECOND INTERNATIONAL CONFERENCE ON ADVANCES IN COMPUTING, CONTROL, AND TELECOMMUNICATION TECHNOLOGIES》 *
WEI CHEN et al.: "Efficient influence maximization in social networks", 《KDD'09 PROCEEDINGS OF THE 15TH ACM SIGKDD INTERNATIONAL CONFERENCE ON KNOWLEDGE DISCOVERY AND DATA MINING》 *
GUO SHAOZHONG et al.: "Design and implementation of a GPU-based parallel minimum spanning tree algorithm", 《Application Research of Computers》 *

Cited By (21)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103246554A (en) * 2013-04-10 2013-08-14 上海安路信息科技有限公司 Wiring method and wiring system on basis of graphics processing units
CN103501235A (en) * 2013-07-15 2014-01-08 中国航天标准化研究所 Complex system availability determination method based on cellular automaton
CN103501235B (en) * 2013-07-15 2016-11-30 中国航天标准化研究所 Complication system availability determination method based on cellular automata
WO2015062387A1 (en) * 2013-10-29 2015-05-07 International Business Machines Corporation Selective utilization of graphics processing unit (gpu) based acceleration in database management
US9721322B2 (en) 2013-10-29 2017-08-01 International Business Machines Corporation Selective utilization of graphics processing unit (GPU) based acceleration in database management
US9727942B2 (en) 2013-10-29 2017-08-08 International Business Machines Corporation Selective utilization of graphics processing unit (GPU) based acceleration in database management
US9443034B2 (en) 2014-05-29 2016-09-13 Microsoft Technology Licensing, Llc Estimating influence using sketches
CN106681960A (en) * 2017-01-04 2017-05-17 中山大学 Acceleration method for solution of linear equation set with Monte Carlo method based on shared memory
CN106681927A (en) * 2017-01-09 2017-05-17 郑州云海信息技术有限公司 Method and device for generating test case
CN107958032B (en) * 2017-11-20 2020-11-13 北京工商大学 Effective dynamic network node influence measuring method
CN107958032A (en) * 2017-11-20 2018-04-24 北京工商大学 A kind of effective dynamic network node influence power measure
CN108596824A (en) * 2018-03-21 2018-09-28 华中科技大学 A kind of method and system optimizing rich metadata management based on GPU
CN110099003B (en) * 2018-06-11 2020-07-14 电子科技大学 Parallel routing optimization method under elastic optical network
CN110099003A (en) * 2018-06-11 2019-08-06 电子科技大学 A kind of parallel Routing Optimization Algorithm under elastic optical network
CN111832714A (en) * 2019-04-19 2020-10-27 上海寒武纪信息科技有限公司 Operation method and device
CN111832714B (en) * 2019-04-19 2023-11-17 上海寒武纪信息科技有限公司 Operation method and device
CN110288507A (en) * 2019-05-06 2019-09-27 中国科学院信息工程研究所 A kind of multi partition strongly connected graph detection method based on GPU
CN110992442A (en) * 2019-12-18 2020-04-10 南京富士通南大软件技术有限公司 Greedy urban traffic map planarization method
CN110992442B (en) * 2019-12-18 2024-04-05 南京富士通南大软件技术有限公司 Urban traffic map planarization method based on greedy
CN112070223A (en) * 2020-08-17 2020-12-11 电子科技大学 Model parallel method based on Tensorflow framework
CN115374914A (en) * 2022-10-24 2022-11-22 北京白海科技有限公司 Distributed training method, parallel deep learning framework and electronic equipment

Also Published As

Publication number Publication date
CN102819664B (en) 2015-02-18

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C14 Grant of patent or utility model
GR01 Patent grant
CF01 Termination of patent right due to non-payment of annual fee

Granted publication date: 20150218

Termination date: 20210718
