Linear regression

Basic meaning

In statistics, linear regression is a form of regression analysis that models the relationship between one or more independent variables and a dependent variable using a least squares function called the linear regression equation. This function is a linear combination of one or more model parameters called regression coefficients. The case with only one independent variable is called simple regression, and the case with more than one independent variable is called multiple regression. (This in turn should be distinguished from multivariate linear regression, in which several correlated dependent variables are predicted rather than a single scalar variable.)

In linear regression, the data are modeled using a linear predictor function, and the unknown model parameters are estimated from the data. Such models are called linear models. Most commonly, the conditional mean of y given the value of X is modeled as an affine function of X. Less commonly, the median or some other quantile of the conditional distribution of y given X is modeled as a linear function of X. Like all forms of regression analysis, linear regression focuses on the conditional probability distribution of y given X, rather than on the joint probability distribution of X and y (which is the domain of multivariate analysis).

Linear regression was the first type of regression analysis to be studied rigorously and to be widely used in practical applications. This is because a model that depends linearly on its unknown parameters is easier to fit than a model that depends non-linearly on them, and the statistical properties of the resulting estimates are easier to determine.

Linear regression models are often fitted by least squares, but they may also be fitted in other ways, for example by minimizing the "lack of fit" under some other criterion (as in least absolute error regression), or by minimizing a penalized version of the least squares loss function, as in bridge regression. Conversely, the least squares approach can also be used to fit nonlinear models. Therefore, although the "least squares method" and the "linear model" are closely connected, they cannot be equated.
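To make the difference between fitting criteria concrete, the following minimal sketch fits the simplest possible model, a single constant c, to hypothetical data containing one outlier. It relies on two well-known facts: the sum of squared errors is minimized by the mean, and the sum of absolute errors is minimized by the median. The data values and function names are illustrative assumptions, not taken from the article.

```python
import statistics

def sum_squared_error(c, ys):
    """Least squares criterion for the constant model y = c."""
    return sum((y - c) ** 2 for y in ys)

def sum_absolute_error(c, ys):
    """Least absolute error criterion for the constant model y = c."""
    return sum(abs(y - c) for y in ys)

# Hypothetical observations; the last value is a deliberate outlier
ys = [1.0, 2.0, 3.0, 4.0, 100.0]

mean_fit = statistics.mean(ys)      # minimizer of the squared-error criterion
median_fit = statistics.median(ys)  # minimizer of the absolute-error criterion

print("least squares fit:        ", mean_fit)
print("least absolute error fit: ", median_fit)
```

The two criteria give different fitted values for the same data, and the least squares fit is pulled much further toward the outlier.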

Fitting equation

Least squares method

In general, the equation of a linear regression can be found with the least squares method, which yields the straight line y = bx + a.
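As a minimal sketch of how the line y = bx + a can be computed, the function below uses the standard closed-form least squares estimates b = S_xy / S_xx and a = mean(y) - b * mean(x). The function name `fit_line` and the sample data are hypothetical and chosen only for illustration.

```python
def fit_line(xs, ys):
    """Return (a, b) for the least squares line y = b*x + a."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need at least two paired observations")
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    # Centered sums of squares and cross products
    s_xx = sum((x - x_mean) ** 2 for x in xs)
    s_xy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    if s_xx == 0:
        raise ValueError("all x values are identical; the slope is undefined")
    b = s_xy / s_xx           # slope
    a = y_mean - b * x_mean   # intercept
    return a, b

# Hypothetical data lying exactly on y = 2x + 1
xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.0, 5.0, 7.0]
a, b = fit_line(xs, ys)
print(f"y = {b:.3f}x + {a:.3f}")
```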

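In general, more than one factor affects y. Suppose there are k factors $x_1, x_2, \ldots, x_k$; the following linear relationship is then usually considered:

$$y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \cdots + \beta_k x_k + \varepsilon$$

Make n independent simultaneous observations of y and $x_1, x_2, \ldots, x_k$ to obtain n sets of observations $(x_{t1}, x_{t2}, \ldots, x_{tk}, y_t)$, $t = 1, 2, \ldots, n$ ($n > k + 1$). They satisfy the relation:

$$y_t = \beta_0 + \beta_1 x_{t1} + \beta_2 x_{t2} + \cdots + \beta_k x_{tk} + \varepsilon_t$$

where $\varepsilon_1, \varepsilon_2, \ldots, \varepsilon_n$ are mutually uncorrelated random variables with the same distribution as $\varepsilon$. To express the above formula in matrix form, let:

$$Y = \begin{pmatrix} y_1 \\ y_2 \\ \vdots \\ y_n \end{pmatrix}, \quad
X = \begin{pmatrix} 1 & x_{11} & \cdots & x_{1k} \\ 1 & x_{21} & \cdots & x_{2k} \\ \vdots & \vdots & & \vdots \\ 1 & x_{n1} & \cdots & x_{nk} \end{pmatrix}, \quad
\beta = \begin{pmatrix} \beta_0 \\ \beta_1 \\ \vdots \\ \beta_k \end{pmatrix}, \quad
\varepsilon = \begin{pmatrix} \varepsilon_1 \\ \varepsilon_2 \\ \vdots \\ \varepsilon_n \end{pmatrix}$$

Then $Y = X\beta + \varepsilon$, and the least squares method gives the solution $\hat{\beta} = (X^T X)^{-1} X^T Y$, where $(X^T X)^{-1} X^T$ is called the pseudo-inverse of $X$.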
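The matrix solution can be sketched without any external library by forming the normal equations $X^T X \beta = X^T Y$ and solving them with Gaussian elimination. This is an illustrative sketch only: the helper names `solve_linear_system` and `fit_multiple` and the sample data are assumptions, and production code would normally use a numerically more robust routine (for example one based on a QR decomposition).

```python
def solve_linear_system(A, rhs):
    """Solve A z = rhs by Gaussian elimination with partial pivoting."""
    m = len(A)
    # Augmented matrix [A | rhs], copied so the inputs are not modified
    M = [list(row) + [r] for row, r in zip(A, rhs)]
    for col in range(m):
        pivot = max(range(col, m), key=lambda r: abs(M[r][col]))
        if abs(M[pivot][col]) < 1e-12:
            raise ValueError("X^T X is singular; columns of X are collinear")
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, m):
            factor = M[r][col] / M[col][col]
            for c in range(col, m + 1):
                M[r][c] -= factor * M[col][c]
    # Back substitution on the upper-triangular system
    z = [0.0] * m
    for i in reversed(range(m)):
        tail = sum(M[i][j] * z[j] for j in range(i + 1, m))
        z[i] = (M[i][m] - tail) / M[i][i]
    return z

def fit_multiple(rows, ys):
    """Least squares coefficients [b0, b1, ..., bk] from the normal equations."""
    X = [[1.0] + list(row) for row in rows]  # leading 1 gives the intercept b0
    p = len(X[0])
    xtx = [[sum(r[i] * r[j] for r in X) for j in range(p)] for i in range(p)]
    xty = [sum(r[i] * y for r, y in zip(X, ys)) for i in range(p)]
    return solve_linear_system(xtx, xty)

# Hypothetical noise-free data generated from y = 1 + 2*x1 + 3*x2
rows = [(0, 0), (1, 0), (0, 1), (1, 1), (2, 1)]
ys = [1 + 2 * x1 + 3 * x2 for x1, x2 in rows]
beta = fit_multiple(rows, ys)
print([round(b, 6) for b in beta])
```

Because the sample data contain no noise, the recovered coefficients should match the generating values up to floating-point rounding.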

Regression coefficient

Generally, this value is required to be greater than 5%. For most behavioral researchers, the most important quantity is the regression coefficient. When age increases by 1 unit, the rated quality of the document changes by -1020986 units, indicating that older people give the document a lower quality rating. The corresponding t value for this variable is -2.10; its absolute value is greater than 2 and the p value is also < 0.05, so the effect is significant. The conclusion is that older people rate document quality lower, and that this effect is significant. By contrast, people with richer domain knowledge rate the quality of the document higher, but this effect is not significant. Interpreting regression coefficients in this way is the process of hypothesis testing with regression analysis.

Error of regression equation

Sum of squared deviations

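$$SSE = S_{yy}\,(1 - r^2), \qquad S_{yy} = \sum_{i=1}^{n} (y_i - \bar{y})^2, \qquad r^2 = \frac{S_{xy}^2}{S_{xx}\,S_{yy}}$$

where $S_{yy}$ represents the sum of squares of y about its mean; r is the correlation coefficient, and $r^2$ represents the proportion of variation explained by the regression line; SSE is the variation that cannot be explained by the regression line. Here $S_{xx} = \sum_{i=1}^{n} (x_i - \bar{x})^2$ and $S_{xy} = \sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})$.

From the relationship between the correlation coefficient and the slope of the straight line, the equivalent form can be obtained: $SSE = S_{yy} - b\,S_{xy}$, where $b = S_{xy} / S_{xx}$ is the slope of the straight line.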

Using the predicted value

$SSE = \sum_{i=1}^{n} (y_i - \hat{y}_i)^2$, where $y_i$ is the actual measured value and $\hat{y}_i$ is the predicted value calculated from the straight-line equation.
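For simple regression, the unexplained variation can be computed in three algebraically equivalent ways: from the correlation coefficient as S_yy(1 - r^2), from the slope as S_yy - b*S_xy, and directly as the sum of squared residuals between measured and predicted values. The sketch below checks numerically that the three agree; the function name `sse_three_ways` and the data are hypothetical.

```python
def sse_three_ways(xs, ys):
    """Return SSE computed from r^2, from the slope, and from the residuals."""
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    s_xx = sum((x - x_mean) ** 2 for x in xs)
    s_yy = sum((y - y_mean) ** 2 for y in ys)
    s_xy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    b = s_xy / s_xx                        # slope of the fitted line
    a = y_mean - b * x_mean                # intercept of the fitted line
    r_squared = s_xy ** 2 / (s_xx * s_yy)  # squared correlation coefficient
    from_r = s_yy * (1 - r_squared)
    from_slope = s_yy - b * s_xy
    from_residuals = sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))
    return from_r, from_slope, from_residuals

# Hypothetical measurements scattered around a straight line
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]
from_r, from_slope, from_residuals = sse_three_ways(xs, ys)
print(from_r, from_slope, from_residuals)
```

The three values differ only by floating-point rounding; the residual form is usually the most numerically stable of the three.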

Uncertainty

Slope b

Method 1: Use

Method 2: Substitute the slope b into
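As a hedged illustration of slope uncertainty, the sketch below assumes the usual textbook standard error for simple regression, s_b = sqrt(SSE / ((n - 2) * S_xx)), and forms the t value as b / s_b. This formula is an assumption brought in for the example rather than one stated in this article, and the function name and data are hypothetical.

```python
import math

def slope_with_uncertainty(xs, ys):
    """Return (b, s_b, t): slope, its standard error, and the t value b / s_b."""
    n = len(xs)
    if n < 3:
        raise ValueError("need at least three points for n - 2 degrees of freedom")
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    s_xx = sum((x - x_mean) ** 2 for x in xs)
    s_xy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    b = s_xy / s_xx
    a = y_mean - b * x_mean
    # Unexplained variation: sum of squared residuals about the fitted line
    sse = sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))
    # Assumed textbook formula for the standard error of the slope
    s_b = math.sqrt(sse / (n - 2) / s_xx)
    # A perfect fit has zero standard error; report an infinite t value then
    t = b / s_b if s_b > 0 else math.copysign(math.inf, b)
    return b, s_b, t

# Hypothetical measurements scattered around a straight line
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]
b, s_b, t = slope_with_uncertainty(xs, ys)
print(f"b = {b:.4f}, s_b = {s_b:.4f}, t = {t:.2f}")
```

A t value whose absolute value exceeds roughly 2 is the rule of thumb for significance used in the regression coefficient discussion; an exact p value would additionally require the t distribution with n - 2 degrees of freedom.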

Application

Mathematics

Linear regression has many practical uses, which fall into the following two categories:

Trend line

A trend line represents the long-term trend of time series data. It tells us whether a particular set of data (such as GDP, oil prices, or stock prices) has increased or decreased over a period of time. Although a trend line can be drawn roughly by eye from the positions of the data points in the coordinate system, a more appropriate method is to use linear regression to calculate the position and slope of the trend line.