Hash function

Synonymhashvaluegenerallyreferstothehashfunction.

Theconceptandfunctionofthehashtable

Theelementsinthehashtablearedeterminedbythehashfunction.ThekeyKofthedataelementisusedasanindependentvariable,andthevaluecalculatedthroughacertainfunctionalrelationship(calledahashfunction)isthestorageaddressoftheelement.Itisexpressedas:

Addr=H(key)

Therefore,twomainproblemsneedtobesolvedbeforeestablishingahashtable:

⑴ConstructasuitableHashfunction

ThevalueofuniformityH(key)isevenlydistributedinthehashtable;

Simpletoimprovethespeedofaddresscalculation

⑵Conflict

Conflict:Inthehashtable,differentkeyvalues​​correspondtothesamestoragelocation.Thatis,thekeywordK1≠K2,butH(K1)=H(K2).Auniformhashfunctioncanreduceconflicts,butconflictscannotbeavoided.Afteraconflictoccurs,itmustberesolved;thatis,thenextavailableaddressmustbefound.

Methodstoresolveconflicts:

⑴Linkmethod(zippermethod).Storetherecordswiththesamehashaddressinalinearlinkedlist.Forexample,intheremaindermethod,thekeywordis(18,14,01,68,27,55,79),andthedivisoris13.Thehashaddressis(5,1,1,3,1,3,1),andthehashtableisshowninthefigure.

⑵Openaddressingmethod.Ifh(k)isalreadyoccupied,searchaccordingtothefollowingsequence:(h(k)+p⑴)%TSize,(h(k)+p⑵)%TSize,…,(h(k)+p(i))%TSize,...

Amongthem,h(k)isthehashfunction,TSizeisthelengthofthehashtable,andp(i)istheprobefunction.Onthebasisofh(k)+p(i-1))%TSize,ifaconflictisfound,theincrementp(i)isusedtoperformanewdetectionuntilnoconflictoccurs.Amongthem,accordingtothedifferentprobefunctionp(i),theopenaddressingmethodisdividedintolinearprobemethod(p(i)=i:1,2,3,...)andsecondprobemethod(p(i)=(-)1)^(i-1)*((i+1)/2)^2,thesequenceofdetectionis:1,-1,4,-4,9…),randomdetectionmethod(p(i):randomNumber),doublehashfunctionmethod(doublehashfunctionh(key),hp(key)Ifh(key)conflicts,thenusehp(key)tofindthehashaddress.Theprobesequenceis:h(k),h(k)+hp(k),...,h(k)+i*hp(k)).

(3)Bucketaddressingmethod.Bucket:Alargeenoughstoragespace.Bucketaddressing:Associateabucketforeachaddressinthetable.Ifthebucketisfull,openaddressingcanbeusedtodealwithit.Forexample,insertA5,A2,A3,B5,A9,B2,B9,C2,anduselinearprobingtoresolveconflicts.Asshown.

Howtoconstructahashtable

Directaddressing

Forexample:Thereisademographictablefrom1to100yearsold,whereageisthekeyWord,thehashfunctiontakesthekeyworditself.

Numericalanalysismethod

Thebirthdaydataofsomestudentsisasfollows:

Year.Month.Day

75.10.03

75.11.23

76.03.02

76.07.12

75.04.21

76.02.15

...

Afteranalysis,thefirstplace,thesecondplace,andthethirdplacearelikelytoberepeated.Takingthesethreeplaceswillincreasethechanceofconflict,sotrynottotakethefirstthreeplaces.,Thelastthreearebetter.

Squarethemiddlemethod

Takethemiddlefewdigitsafterthekeywordsquareasthehashaddress.

Foldingmethod

Splitthekeywordintoseveralpartswiththesamedigits(thedigitsofthelastpartcanbedifferent),andthentakethesuperimposedsumoftheseparts(roundup)Asahashaddress,thismethodiscalledfolding.

Forexample:everyWestern-languagebookhasaninternationalstandardbooknumber,whichisa10-digitdecimalnumber.Ifyouwanttouseitasakeytoestablishahashtable,whenthetypeoflibrarybookisnotWhenitreaches10,000,thismethodcanbeusedtoconstructafour-digithashfunction.

Divideandleaveremaindermethod

Theremainderobtainedbydividingthekeywordbyanumberpnotgreaterthanthelengthmofthehashtableisthehashaddress.

H(key)=keyMODp(p<=m)

randomnumbermethod

choosearandomfunction,taketherandomfunctionvalueofthekeyIsitshashaddress,ie

H(key)=random(key),whererandomisarandomfunction.Thismethodisusuallyusedwhenthekeywordlengthisnotequal.

Ifthehashfunctionandconflicthandlingmethodareknown,thestepstobuildthehashtableareasfollows:

Step1.TakeoutthekeyofadataelementandcalculateitinthehashtableThestorageaddressinD=H(key).IfthestoragespacewiththestorageaddressDisnotoccupied,thedataelementisstored;otherwise,aconflictoccurs,andStep2isexecuted.

Step2.Accordingtothestipulatedconflicthandlingmethod,calculatethestorageaddressunderthedataelementwhosekeyisthekey.Ifthestoragespaceofthestorageaddressisnotoccupied,saveit;otherwise,continuetoexecuteStep2untilastorageaddressthatisnotoccupiedbythestoragespaceisfound.

Conflict

Nomatterhowelaboratethehashfunctiondesignis,therewillbeconflicts,thatis,theresultsofthetwokeywordprocessingfunctionsaremappedinthesameposition.Therefore,therearesomeMethodcanavoidconflicts.

Zippermethod

Pulloutadynamiclinkedlistinsteadofastaticsequentialstoragestructure,whichcanavoidtheconflictofthehashfunction,butthedisadvantageisthatthedesignofthelinkedlististoocumbersome,whichincreasesthecomplexityofprogramming.Thismethodcancompletelyavoidtheconflictofthehashfunction.

Multi-hashmethod

Designingtwoormorehashfunctionscanavoidconflicts,butthereisstillachanceofconflict.Thebetterormorethefunctionisdesigned,theprobabilitycanbereducedMinimize(unlessthecharacteristoobad,itisalmostimpossibletoconflict).

Openaddressmethod

Theopenaddressmethodhasaformula:Hi=(H(key)+di)MODmi=1,2,...,k(k<=m-1)

Amongthem,misthetablelengthofthehashtable.diistheincrementalsequencewhenaconflictoccurs.Ifthevalueofdimaybe1,2,3,...m-1,calllineardetectionandthenhash.

Ifditakes1,thenaftereachconflict,movebackwardby1position.Ifditakes1,-1,4,-4,9,-9,16,-16,...k*k,-k*k(k<=m/2)

Itiscalledtheseconddetectionandthenhashing.Ifthevalueofdimaybeapseudo-randomnumbersequence.Itiscalledpseudo-randomdetectionandthenhashing.

Domainbuildingmethod

Assumingthatthevaluerangeofthehashfunctionis[0,m-1],setthevectorHashTable[0..m-1]asthebasictable,andSetupastoragespacevectorOverTable[0..v]tostoreconflictingrecords.

Related Articles
TOP