Vector RAG

In the last lesson, you learned about Retrieval Augmented Generation (RAG) and the role of retrievers in finding relevant information.

One of the challenges of RAG is understanding what the user is asking for and finding the correct information to pass to the LLM.

In this lesson, you will learn about semantic search and how vector indexes can help you find relevant information from a user’s question.

Semantic search aims to understand the intent and contextual meaning of a search phrase, rather than focusing on individual keywords.

Traditional keyword search often depends on exact-match keywords or proximity-based algorithms that find similar words.

For example, if you input "apple" in a traditional search, you might predominantly get results about the fruit.

However, in a semantic search, the engine tries to gauge the context: are you searching for the fruit, the tech company, or something else?

The results are tailored based on the term and the perceived intent.

Vectors

You can represent data as vectors to perform semantic search.

A vector is simply a list of numbers. For example, the vector [1, 2, 3] is a list of three numbers and could represent a point in three-dimensional space.

A diagram showing a 3d representation of the x

You can use vectors to represent many different types of data, including text, images, and audio.

The number of dimensions in a vector is called the dimensionality of the vector. A vector with three numbers has a dimensionality of 3.
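As a minimal sketch, the vector from the example above can be written as a plain Python list, and its dimensionality is just the length of that list. The variable names are illustrative only.

```python
# A vector is an ordered list of numbers.
point = [1, 2, 3]

# Its dimensionality is the number of values it holds.
dimensionality = len(point)

print(f"{point} has a dimensionality of {dimensionality}")
```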

Embeddings

When referring to vectors in the context of machine learning and NLP, the term "embedding" is typically used. An embedding is a vector that represents the data in a useful way for a specific task.

Each dimension in a vector can represent a particular semantic aspect of the word or phrase. When multiple dimensions are combined, they can convey the overall meaning of the word or phrase.

For example, the word "apple" might be represented by an embedding with the following dimensions:

  • fruit

  • technology

  • color

  • taste

  • shape

When applied in a search context, the vector for "apple" can be compared to the vectors for other words or phrases to determine the most relevant results.
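The comparison described above can be sketched with a toy example. The scores below are invented by hand for illustration, and the words "banana" and "laptop" are assumptions added for the comparison. Real embedding models do not produce neatly labeled dimensions like these. The sketch uses a simple dot product as a rough measure of overlap.

```python
# Hypothetical, hand-written scores for illustration only; real embedding
# models do not produce labeled dimensions like these.
DIMENSIONS = ["fruit", "technology", "color", "taste", "shape"]

toy_embeddings = {
    "apple":  [0.9, 0.6, 0.5, 0.8, 0.4],
    "banana": [0.9, 0.0, 0.6, 0.8, 0.5],
    "laptop": [0.0, 0.9, 0.2, 0.0, 0.6],
}

def overlap(a, b):
    """Dot product: the sum of element-wise products of two vectors."""
    return sum(x * y for x, y in zip(a, b))

def rank_similar(word):
    """Return the other words, ordered from most to least overlap with `word`."""
    query = toy_embeddings[word]
    others = [w for w in toy_embeddings if w != word]
    return sorted(others, key=lambda w: overlap(query, toy_embeddings[w]), reverse=True)

# Pair each score with its dimension name to make the toy vector readable.
labeled_apple = dict(zip(DIMENSIONS, toy_embeddings["apple"]))
print(labeled_apple)

ranking = rank_similar("apple")
print(ranking)
```

With these invented scores, "banana" ranks above "laptop" for the query "apple" because the fruit and taste dimensions dominate the overlap.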

You can create embeddings in various ways, but one of the most common methods is to use an embedding model.

For example, the embedding for the word "apple" begins 0.0077788467, -0.02306925, -0.007360777, -0.027743412, -0.0045747845, 0.01289164, -0.021863015, -0.008587573, 0.01892967, -0.029854324, -0.0027962727, 0.020108491, -0.004530236, 0.009129008, … and so on.

The complete embedding for the word "apple" is:
`0.0077788467, -0.02306925, -0.007360777, -0.027743412, -0.0045747845, 0.01289164, -0.021863015, -0.008587573, 0.01892967, -0.029854324, -0.0027962727, 0.020108491, -0.004530236, 0.009129008, -0.021451797, 0.002030382, 0.030813828, 9.744976e-05, 0.0019172973, -0.02568733, -0.020985752, -0.008066699, 0.02134214, -0.01222684, 0.0009980568, 0.005105939, 0.009999417, -0.000107408916, 0.015845545, -0.012980737, 0.020574536, -0.016160812, -0.018518453, 0.005263572, -0.019286057, -0.009293495, -0.012096621, -0.008854863, -0.005753605, -0.006157968, 0.010540851, 0.007724018, -0.0065554776, 0.00052944134, -0.023453051, 0.011089141, -0.021671113, -0.00061425474, -0.012754567, 0.015489157, -0.0054520466, -0.0020355221, -0.015050527, -0.0052944133, -0.0028082666, 0.0027431573, -0.019450543, 0.0063807103, -0.010725899, 0.0049243183, 0.005266999, 0.01513277, -0.027921606, 0.0055754115, -0.009183837, 0.00380718, -0.013624975, -0.0084710615, 0.012905347, 0.015667351, 0.033363372, 0.013268588, 0.014036193, 0.0063464423, 0.004454846, 0.0014820931, -0.03396649, -0.0062779062, -0.00314238, 0.01818948, 0.0075389706, -0.02637269, 0.009574492, 0.024974553, 0.024823774, 0.009882905, -0.021657405, 0.010109074, -0.007970748, 0.0028887964, 0.011849891, 0.0054726074, 0.0078336755, 0.016448664, -0.026975807, 0.016599443, -0.012713445, 0.026345275, 0.004667308, -0.03736588, 0.0009834929, 0.006089432, -0.028730331, -0.011198798, -0.020396343, 0.0019738395, 0.012459862, -0.003738644, 0.015448036, -0.019902883, 0.0064389664, 0.00926608, 0.021945259, -0.051648803, -0.016448664, -0.01744929, -0.009499103, 0.0021743076, -0.022795105, -0.035556525, 0.034021318, 0.025892938, 0.038407627, -0.008752059, 0.013446782, -0.0032640316, -0.01779197, -0.009567639, -0.0011205651, -0.013947096, 0.04707059, 0.008100967, 0.019491665, 0.016448664, -0.017846799, 0.019573908, -0.02223311, 0.015489157, -0.0057433248, -0.033445615, 0.010554559, 0.014694139, -0.01239818, 0.0070660715, -0.011226213, 0.023686076, 
0.02360383, 0.022753984, -0.005215597, 0.0070866323, 0.010753313, -0.024110999, -0.003909984, 0.005462327, 0.0017459571, 0.0057981536, -0.016983245, -0.0021777344, -0.0039373985, 0.003772912, -0.006634294, 0.008614987, -0.006579465, -0.008841156, 0.0017699447, 0.024412557, 0.011856745, 0.013522171, -0.016051153, -0.00951281, -0.016133398, 0.004177275, -0.010691631, 0.01296703, 0.00886857, 0.016078569, 0.004434285, 0.012734006, -0.0067850733, 0.0006545197, 0.0011317023, -0.0046090526, 0.023096664, 0.01946425, -0.016640564, 0.014899747, 0.004701576, -0.010568266, 0.005530863, -0.019231228, 0.032047477, 0.02041005, -0.00397852, -0.014419994, -0.684703, -0.020643072, 0.00603803, -0.00033582686, 0.033993904, 0.03188299, 0.022287939, -0.0012739147, -0.018381381, -0.010396926, 0.0018042127, 0.0032863058, 0.00886857, 0.009519664, 5.9969083e-05, -0.022287939, 0.016284177, -0.023658661, -0.010431194, 0.02489231, -0.012261108, -0.014351458, -0.008841156, -0.029717252, 0.0036564006, 0.019628737, 0.019957712, -0.014022485, -0.019560201, 0.021767065, -0.008238039, -0.00048146606, 0.027291073, 0.0060140425, 0.037393294, 0.0072031436, -0.04416466, 0.013940242, 0.009663589, 0.03415839, -0.02065678, -0.020423757, 0.013563293, -0.0065246364, -0.015872959, -0.0009278074, 0.013254881, 0.005637094, -0.00071491714, -0.025344647, 0.03484375, 4.8269758e-05, 0.010787581, 0.008409379, 0.021780772, 0.008738352, 0.023124078, -0.008745206, -0.001522358, 0.016448664, -0.022370182, -0.0034011037, -0.034734093, -0.02523499, -0.020547122, 0.010636802, -0.009190691, 0.0076417746, 0.005434912, -0.01951908, 0.021492919, 0.022438718, -0.02306925, -0.007059218, -0.0031115387, 0.01705178, 0.023576416, -0.00148809, -0.027071757, 0.0047461246, -0.0023867695, -0.009389445, 0.0049414523, -0.027537804, 0.03158143, 0.0054246318, -0.024042463, -0.011301602, 0.013926535, -0.02371349, 0.034130976, 0.023932805, 0.0028682356, -0.019148985, -0.014570774, -0.0053423885, -0.032376453, -0.019244935, -0.0021434664, 
-0.019930298, 0.016530907, -0.0056302403, 0.00943742, 0.0067679393, 0.024028756, 0.013474196, -0.019477958, 0.014570774, 0.03673535, -0.020437464, -0.0076623354, -0.012631202, 0.008587573, -0.00869723, 0.025824402, -0.03125246, 0.010629948, -0.00761436, 0.021067996, -0.032952156, 0.025399476, -0.00438631, 0.011863599, 0.003027582, -0.01059568, 0.018463625, -0.0045405165, -0.030978315, -0.0034884873, -0.0059420797, 0.008018723, 0.0052190237, 0.007299094, -0.006250492, 0.02390539, 0.0004050055, 0.009965149, -0.020670487, 0.011993817, -0.02508421, -0.016969537, 0.007991308, 0.000463047, -0.00052258774, 0.0012704879, -0.01232279, -0.028511016, -0.016887294, -0.010862971, 0.0052361577, -0.008861717, 0.005530863, -0.0017579509, 0.021506626, 0.022589497, -0.015900373, 0.0028596686, -0.0233571, 0.0009406579, 0.016229348, 0.010205025, 0.028182043, -0.009026204, 0.0042218235, 0.0150368195, -0.035803255, 0.0068193413, 0.0018727488, -0.017846799, -0.029251205, 0.01340566, -0.016887294, -0.008190064, 0.008286014, -0.014748968, 0.0039888006, -0.0149682835, 0.007477288, 0.01015705, 0.002385056, 0.0054314854, 0.008861717, 0.0021023448, 0.0016602869, 0.030896071, 0.020053662, 0.0016157385, 0.04767371, -0.020218149, 0.0008228615, -0.013467343, 0.019820638, 0.0053252545, 0.0016525766, -0.013816877, -0.008477915, -0.0059592137, 0.013398807, -0.0009586486, 0.01150721, 0.023973927, -0.0029007902, 0.011246773, -0.0022873923, 0.013775756, -0.03292474, 0.003995654, -0.005369803, 0.011294749, 0.03459702, -0.0022771119, -0.028593259, -0.0066068796, -0.020451171, 0.012357058, 0.034185804, 0.002359355, 0.012185718, 0.0009329476, -0.007984455, 0.0016688539, -0.0047666854, 0.00047204236, -0.0036769616, -0.0074567273, 0.0034833471, 0.010115928, 0.03328113, -0.003368549, -0.026071131, -0.0035535966, -0.004986001, -0.00934147, -0.0125215445, 0.004143007, 0.014872333, 0.004146434, -0.010979483, 0.02223311, -0.0009552218, -0.0140499, 0.014502238, 0.026687955, -0.0020286685, 0.007621214, 
-0.0132617345, 0.045946598, 0.008169503, -0.004143007, -0.0022634047, -0.003240044, -0.025769573, -0.030759, 0.010479169, -0.00090467645, -0.024618166, 0.02350788, 0.022397596, 0.022877349, 0.0408201, 0.0032965862, -0.0034679265, -0.012946469, 0.0059763477, -0.020286685, -0.00019372156, -0.001281625, -0.013672951, 0.0028082666, 0.004146434, 0.013316563, -0.0002972753, 0.024933431, -0.010218732, 0.0067473785, 0.00096807233, -0.017600069, 0.0047495514, 0.0053458153, -0.012453008, -0.021698527, -0.02745556, 0.009060472, 0.003961386, -0.006867317, 0.008950814, -0.028949646, -0.0059455065, -0.005777593, 0.014748968, -0.0032948728, 0.021629991, 0.008320282, 0.020094784, 0.020423757, -0.01380317, 0.031362116, -0.0109863365, 0.005198463, -0.0062025166, 0.00017980016, 0.004968867, -0.019477958, -0.003947679, 0.03942196, -0.0048317946, -0.00595236, -0.024357729, 0.012679177, -0.002345648, -0.025413183, 0.0046227598, -0.015996324, -0.01809353, -0.0029864605, 0.016558321, -0.0055034487, -0.017161438, 0.04071044, -0.0025855242, -0.012644909, -0.01788792, -0.014255508, 0.007943333, 0.06513671, 0.02542689, -0.0109520685, 0.023727197, -0.0055925455, 0.027674876, -0.011945842, -0.006791927, 0.029059304, -0.00075818057, -0.0014101302, -0.008806888, 0.014776383, -0.018449917, 0.023891684, 0.011294749, -0.002393623, -0.020135906, -0.0056816423, -0.008203771, 0.00051230734, -0.014598188, 0.010650509, 0.0055205827, 0.01720256, 0.0057638856, 0.018751476, 0.029196376, -0.005195036, -0.024535922, -0.0060825786, -0.006243638, 0.015297256, -0.006226504, -0.001954992, 0.022301646, 0.017161438, 0.015955202, 0.0059489333, 0.0052601453, 0.012178864, 0.010616241, -0.0037249369, -0.02637269, 0.007792554, -0.011459235, -0.014611895, 0.032568354, -0.0012088054, -0.013810024, 0.024672994, 0.01627047, 0.0050511104, -0.0055891187, -0.00022102891, 0.026729077, -0.0074704345, 0.0031526603, 0.010307829, -0.025659915, -0.0055377167, -0.019998834, 0.0032880192, 0.014502238, -0.0012936188, -0.005650801, 
-0.011376992, -0.018669233, -0.0068536093, -0.011616868, -0.000986063, -0.026358983, -0.011390699, 0.0077308714, 0.033144057, 0.008217478, 0.020889802, 0.0057261907, 0.0069838283, 0.03489858, -0.008306575, -0.014803797, 0.004742698, -0.014474823, -0.022973299, 0.019094156, -0.001972126, -0.013145223, 0.011671697, 0.008649255, 0.013755195, -0.0060448837, 0.02958018, 0.0045028217, -0.0120897675, -0.00046004856, 0.017833091, 0.011986963, -0.019327179, -0.011829331, 0.00795704, -0.010410633, -0.0026334994, -0.008005016, 0.014666725, 0.014653017, 0.019738395, 0.012535252, -0.025276111, 0.0037146565, 0.02760634, -0.004441139, 0.014831211, -0.0109863365, 0.01222684, 0.0138305845, -0.008786327, -0.0074156057, -0.0052190237, -0.015900373, -0.02099946, -0.04997652, 0.014255508, 0.02094463, -0.014104729, 0.020464879, -0.004986001, -0.007970748, -0.020889802, 0.012219986, -0.008710938, -0.0025820974, -0.0013553012, -0.013857999, -0.033555273, -0.027016928, -0.01646237, 0.020862387, 0.0009629321, -0.017435582, -0.020272978, 0.018271724, 0.008155796, -0.024878602, -0.02834653, -0.049181502, 0.011431821, 0.003176648, 0.0035056213, 0.02952535, -0.015283549, 0.017572654, -0.006905012, 0.014214386, -0.026208203, -0.022164574, -0.028428772, 0.00012647052, 0.03829797, 0.018258017, 0.020423757, 0.014077314, 0.016640564, -0.00020646499, 0.0044616996, -0.008587573, 0.0029898873, 0.012219986, -0.018518453, 0.013679804, 0.014557066, 0.015859252, 0.0027071757, 0.012919054, -0.0039750934, 0.012788836, 0.0042560915, -0.0023353675, -0.027990142, -0.005404071, -0.004451419, -0.009444274, -0.019848052, 0.01008166, 0.0092455195, -0.024316607, 0.019162692, 0.009087887, 0.0017819385, -0.02922379, 0.025043089, -0.009972002, 0.021328432, 0.01141126, 0.0053903637, -0.026701663, -0.006685696, 0.008827449, -0.007477288, 0.015146477, -0.0068775974, 0.007792554, -0.014515945, -0.0074361665, 0.0058358484, 0.041149072, -0.025591379, -0.022356475, 0.0068570366, -0.04188926, -0.0053766565, -0.006411552, 
-0.009663589, -0.016092276, 0.001164257, 0.013556439, 9.952459e-06, 0.0003868006, -0.0058358484, -0.017367046, 0.0061682486, 0.020135906, 0.029991396, 0.0025769572, 0.035227552, 0.021602577, -0.0034576461, -0.019573908, 0.0022548377, -0.009533371, -0.011610014, 0.026454933, 0.01488604, 0.012315936, -0.007209997, -0.0028511016, 0.0045370897, -0.010239293, -0.0096430285, 0.035008237, 0.01769602, 0.016188227, -0.027976435, -0.031115387, -0.01946425, 0.026729077, -0.0048352215, -0.002503281, -0.015091648, -0.03829797, -0.01116453, 0.026331568, -0.01232279, 0.019505372, 0.004180702, -0.013912828, 0.01513277, -0.011849891, -0.02489231, 0.00088068884, -0.0026095118, 0.02740073, -0.02405617, 0.018203188, -0.0012859085, 0.005318401, -0.006349869, -0.007758286, 0.004674162, 0.03169109, -0.02785307, -0.0008571296, 0.0026369262, 0.015077941, 0.010623095, -0.012103475, -0.022260524, -0.009204398, -0.0028733758, -0.027976435, 0.010013124, 0.0077788467, -0.021013167, -0.011150823, 0.008244893, -0.006247065, -0.0062402114, 0.0027979861, 0.01372778, -0.0007671759, -0.013426221, 0.016928416, -0.0016191653, 0.0033668356, 0.026975807, -0.0121240355, -0.010705338, 0.023768319, -0.020793851, 0.00081129605, 0.0079022115, 0.0023096665, -0.024028756, 0.009937734, -0.0037592049, -0.0038483017, 0.020204442, -0.019546494, -0.012267961, -0.004338335, 0.0074361665, 0.016201934, 0.0024775798, 0.0061339806, 0.013248027, -0.008532744, -0.0019669859, -0.012713445, -0.030183297, 7.549679e-05, -0.012473569, -0.002210289, 0.02075273, -0.003116679, -0.0025872376, -0.003793473, 0.007299094, 0.0136592435, -0.024522215, -0.03391166, -0.021410676, 0.020506, -0.01463931, 0.00017551666, -0.020643072, -0.002201722, -0.022109745, 0.003632413, -0.0009286641, 0.00044891142, 0.0027191697, 0.014666725, 0.013391953, 0.02386427, -0.009039911, 0.0021348994, -0.013837438, -0.021410676, -0.021602577, -0.0059146653, 0.0048729163, 0.017983872, 0.01961503, -0.021917844, -0.028839989, -0.00808726, -0.03983318, -0.03254094, 
-0.005739898, 0.013248027, -0.00070206664, 0.006140834, 0.010013124, 0.0055411435, 0.0063841376, 0.016791344, -0.047564052, -0.0010725899, 0.004989428, -0.020917216, 0.022370182, -0.022959592, -0.020451171, -0.023233736, 0.001032325, 0.008094113, 0.0010777301, 0.01116453, 0.00038637224, -0.0033188604, -0.00886857, 0.022150867, 0.006394418, -0.00013310995, 0.009300348, -0.01883372, -0.009553932, 0.0032109162, -0.0007637491, -0.023727197, 0.0063258815, 0.009122155, 0.008327136, 0.008066699, 0.0013090394, -0.0051539144, 0.00975954, -0.020026248, -0.005873543, -0.011308456, -0.018765183, 0.014310337, -0.024412557, -0.017942749, -0.012535252, 0.010342097, -0.0243029, -0.010198171, 0.026838735, -0.0081078205, -0.0144337015, -0.010568266, 0.022301646, -0.03489858, -0.008066699, -0.0028802294, -0.023110371, -0.024193242, 0.03829797, 0.0029898873, -0.008361404, -0.0076280674, 0.014611895, 0.009560785, -0.0039716666, -0.004297213, 0.013446782, -0.022507254, -0.013337124, 0.008423086, -0.018600697, -0.023850562, 0.003947679, 0.0113838455, -0.0022788253, -0.0041909823, 0.20747247, -0.007059218, 0.016599443, 0.03988801, -0.0005011702, -0.0007568955, 0.015543986, 0.013145223, -0.0038825697, 0.0050339764, -0.014817504, 0.011767647, -0.015242428, 0.007299094, 0.010890386, -0.007580092, -0.03489858, -0.0089713745, -0.016393835, -0.0060825786, 0.023658661, -0.011459235, -0.011610014, -0.011514064, 0.02897706, 0.003108112, -0.02927862, 0.009889758, 0.018641818, 0.010150196, -0.00020453741, -0.004146434, -0.0039339717, -0.002090351, -0.008361404, -0.0001941499, -0.0075389706, 0.024165828, 0.02745556, 0.026920978, -0.0015789003, -0.00090638985, -0.007888504, -0.0035570234, -0.028127214, 0.0142966295, -0.008457354, -0.007360777, 0.023041835, 0.021753358, -0.047838196, -0.003755778, 0.025221283, 0.025111625, 0.0014692425, 0.0071346075, 0.0026900417, 0.012727153, -0.00223599, -0.0020423757, -0.00744302, 0.018998206, 0.0012841951, 0.019094156, -0.024330314, -0.0043074936, -0.034240633, 
0.005839275, -0.009300348, -0.008738352, 0.0038654357, -0.020739023, -0.007545824, 0.00035017662, -0.030128468, -0.0408201, 0.024083585, 0.026098546, 0.014598188, 0.022493547, -0.006867317, 0.009252373, -0.006140834, -0.0022942459, -0.006147688, -0.016667979, 0.03223938, -0.00544862, -0.0058872504, -0.003844875, -0.005582265, -0.015448036, 0.004454846, -0.02603001, 0.0056987763, 0.017421875, -0.015790716, 0.01946425, -0.01042434, -0.00070120994, -0.0040641907, -0.017956456, 0.01769602, -0.010095367, -0.008080406, 0.024069877, 0.0029898873, 0.009403152, 0.0057913, 0.006870744, -0.012809397, -0.011424967, 0.01256952, -0.011178237, 0.033829417, 0.009725272, -0.002683188, -0.029086718, 0.017956456, -0.0010940074, 0.0075526778, -0.01868294, 0.0020612231, 0.017517826, -0.01439258, -0.021150239, -0.020780144, 0.00021256898, 0.0167091, -0.028483601, -0.003478207, -0.0048043802, 0.004454846, 0.0034936275, 0.008752059, 0.0024930006, 0.004828368, -0.017654898, -0.0015009405, -0.009320909, 0.0013458775, 0.013816877, 0.020560829, 0.007319655, 0.0035433162, -0.0028168336, 0.002784279, -0.00032833073, -0.023343394, -0.021314725, -0.018792598, 4.789495e-05, -0.018792598, -0.006689123, 0.04213599, -0.01769602, -0.034076147, -0.027592633, -0.01084241, 0.013734634, -0.022753984, -0.01479009, 0.023110371, -0.011795062, -0.04150546, -0.007340216, -0.18016769, 0.027565219, -0.0068775974, 0.0007757429, 0.018299138, 0.0038003265, 0.01676393, 0.009807515, -0.0063601495, 0.0019224375, 0.021259896, 0.0033102934, -0.028922232, -0.011054873, 0.024015049, -0.011596307, -0.004824941, 0.015996324, 0.025166454, 0.011123409, 0.01642125, -0.010047392, 0.01414585, -0.019957712, 0.009999417, 0.023453051, -0.025673622, 0.0014469683, -0.012007524, -0.016284177, -0.014159557, -0.015297256, 0.011260481, 0.0115826, 0.0128299575, -0.007621214, -0.014022485, -0.012363912, 0.0014512518, 0.023644954, 0.02158887, 0.01971098, 0.0078336755, 0.004705003, 0.0062607722, 0.020190734, 0.02006737, -0.019107863, 
0.011952695, -0.019327179, 0.019628737, -0.013556439, -0.0066137332, 0.027825655, 0.00047289906, 0.009649882, -0.015406914, -0.0034216645, -0.020684194, -0.0065554776, -0.01266547, -0.010753313, 0.02016332, -0.018806305, -0.0072579724, -0.016818758, -0.013762048, -0.0081078205, -0.032952156, 0.01661315, -0.012219986, -0.011514064, 0.03169109, -0.024261778, 0.0005153058, -0.0007594656, -0.01818948, 0.026098546, 0.007648628, -0.0021006314, -0.005918092, 0.02143809, -0.017380754, -0.00031376682, -0.0059455065, 0.012219986, -0.0068604634, 0.004283506, -0.027291073, -0.030238125, 0.017750848, -0.019327179, -0.003810607, -0.021602577, 0.021465505, 0.036707934, 0.011801915, 0.004382883, -0.0028151202, 0.0036461202, -0.0018761756, -0.0021880148, -0.030046225, 0.015763301, 0.03563877, -0.0028408212, -0.006127127, 0.01971098, 0.018902255, -0.0025152748, -0.002325087, 0.020889802, 0.031142801, 0.028894817, -0.007429313, 0.0017313932, 0.011438674, -0.025509134, 0.005842702, -0.011856745, 0.025056796, 0.0007873084, 0.019546494, 0.014611895, -0.005088805, -0.011116555, -0.09907578, -0.04421949, 0.009972002, 0.0136935115, 0.015297256, 0.025015675, -0.005164195, 0.022959592, -0.012487276, 0.038709186, 0.0028562418, -0.021396969, -0.00061596814, 0.0077308714, 0.0115826, -0.00037137998, -0.027674876, -0.011555186, -0.022630619, 0.013638683, -0.013851145, -0.016873587, -0.010444901, -0.019217521, -8.918393e-07, 0.00072348415, -0.035254966, 0.028894817, 0.03662569, 0.007038657, 0.030238125, -0.02153404, 0.021301018, -0.038078655, 0.0019464251, 0.007991308, -0.018724062, 0.00628476, 0.019930298, -0.028593259, -0.001396423, 0.0003814462, 0.015516572, -0.03001881, 0.010773874, -0.02213716, 0.00027500108, 0.0010991476, 0.012007524, -0.013241174, -0.013097248, 0.018710354, -0.0021211922, -0.014735261, 0.0070146695, -0.020862387, -0.014063607, 0.0059832013, -0.018737769, 0.004228677, 0.006229931, -0.019628737, -0.00041314415, 0.013556439, 0.022260524, 0.0019738395, -0.0149682835, 
-0.001852188, 0.004776966, -0.018614404, -0.0011445528, -0.012219986, -0.02681132, 0.0461385, -0.021136532, -0.0007084919, -0.019724688, -0.020204442, 0.01365239, -0.032869913, -0.0044308584, -0.030594513, 0.0014675291, -0.008190064, 0.012377619, -0.0052258773, -0.003896277, 0.0078062615, 0.0057124835, -0.034624435, 0.03328113, 0.0022394168, 0.025892938, -0.011925281, -0.025097918, -0.002141753, -0.011445528, -0.0019190107, 0.032020062, -0.01739446, -0.0038174605, -0.0042526647, -0.08059845, 0.021109117, -0.002631786, -0.0049071843, 0.0144337015, 0.0035673038, 0.015982617, -0.036762763, -0.0062402114, -0.0041361535, -0.022041209, 0.010760167, -0.0057810196, -0.010019978, -0.00223599, -0.024878602, 0.019532787, 0.005465754, 0.030621927, 0.016010031, 0.012761421, 0.011308456, 0.019286057, -0.001992687, -0.013028712, 0.00768975, -0.016654272, 0.0029367716, 0.0019464251, -0.020423757, 0.00803243, -0.006428686, -0.014419994, 0.04268428, -0.0003623846, -0.008190064, -0.0047975266, 0.0011676837, -0.00454737, 0.006805634, -0.0066582817, -0.01710661, 0.01788792, -0.018011287, -0.011013751, -0.012014378, -0.011246773, 0.011692258, 0.016476078, -0.013056126, 0.015955202, 0.025796987, -0.016325299, -0.017682312, -0.017983872, -0.054691803, 0.023987634, -0.0020166747, -0.0060311765, -0.016476078, -0.0011616868, 0.033198886, 0.015763301, -0.0074498737, 0.008251746, -0.008477915, -0.016489785, -0.015173892, 0.03234904, -0.019985126, 0.000744045, -0.021410676, 0.016791344, -0.015242428, -0.002912784, -0.0014058467, -0.004824941, -0.0035673038, -0.008320282, 0.025344647, 0.013076687, -0.004735844, -0.034130976, 0.017312218, 0.016832465, 0.017380754, -0.02508421, -0.00808726, 0.013522171, 0.012439301, 0.014707847, 0.017147731, 0.006517783, -0.0010854404, 0.013782609, 0.008512183, -0.009451128, -0.014378873, 0.010636802, 0.023891684, 0.01809353, -0.012946469, -0.014337751, -0.011644282, -0.0018453344, 0.012069207, 0.0038585821, -0.020478586, -0.011843038, 0.02208233, 0.022109745, 
0.005753605, -0.005650801, 0.022904763, -0.02119136, 0.017462997, -0.0059283725, -0.008662962, -0.015585108, 0.035227552, 0.05249865, 0.007634921, 0.015489157, -0.012781982, 0.021026874, 0.013741488, 0.0053423885, -0.024330314, 0.018724062, -0.008450501, 0.008025576, -0.01824431, -0.014762675, -0.014173265, -0.020793851, -0.0004604769, 0.014214386, 0.020670487, -0.019656152, 0.072593436, -0.0074224593, -0.0040539103, 0.00272431, 0.006336162, 0.021013167, 0.006805634, 0.016681686, -0.019203814, -0.009848637, 0.012857372, 0.015077941, 0.011959549, -0.017929042, -0.009320909, -0.0033120068, -0.023192614, 0.008985083, -0.022603204, 0.0060003353, 0.025207575, 0.02445368, 0.008827449, -0.006007189, -0.027647462, -0.010602534, 0.011150823, -0.0067131105, -0.0045884917, -0.041286144, 0.019395715, -0.006212797, -0.053293668, -0.01912157, 0.018326553, -0.016530907, -0.011198798, 0.0027448707, 0.027784534, -0.0013390239, -0.024508508, 0.023754612, -0.021259896, -0.017257389, 0.022027502, -0.012103475, -0.013535879, -0.015667351, 0.0061511146

Embedding models

OpenAI’s text-embedding-ada-002 embedding model created this embedding, which is a vector of 1,536 dimensions.

LLM providers typically expose API endpoints that convert a chunk of text into a vector embedding. Depending on the provider, the shape and size of the vector may differ.

While it is possible to create embeddings for individual words, embedding entire sentences or paragraphs is more common, as the meaning of a word can change based on its context. For example, the word "bank" will have a different vector in "river bank" than in "savings bank".

Semantic search systems can use these contextual embeddings to understand user intent.

Embeddings can represent more than just text. They can also represent entire documents, images, audio, or other data types.

You can use the distance or angle between vectors to gauge the semantic similarity between words or phrases.

A 3 dimensional chart illustrating the distance between vectors. The vectors are for the words "apple" and "fruit"

Words with similar meanings or contexts will have vectors that are close together, while unrelated words will be farther apart.
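As a minimal sketch of both measures, the functions below compute the Euclidean distance and the cosine similarity (based on the angle) between two vectors. The three-dimensional vectors are made-up values for illustration, not real embeddings, and the word "car" is an assumption added as an unrelated term.

```python
import math

def euclidean_distance(a, b):
    """Straight-line distance between two vectors; smaller means closer."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def cosine_similarity(a, b):
    """Cosine of the angle between two vectors; closer to 1 means more similar."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Hypothetical 3-dimensional vectors, chosen by hand for illustration.
apple = [1.0, 2.0, 3.0]
fruit = [1.0, 2.0, 2.5]
car = [3.0, -1.0, 0.5]

print("apple vs fruit:", cosine_similarity(apple, fruit), euclidean_distance(apple, fruit))
print("apple vs car:  ", cosine_similarity(apple, car), euclidean_distance(apple, car))
```

With these values, "apple" and "fruit" have a higher cosine similarity and a smaller distance than "apple" and "car", which matches the idea that related words sit close together.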

Vector RAG

This principle is employed in vector-based RAG to find contextually relevant results for a user’s question.

An embedding model is used to create a vector representation of the source data.

A diagram showing data being processed by an embedding model to create a vector representation of the data. The data is then stored in a vector index.

When a user submits a question, the system:

  1. Creates an embedding of the question.

  2. Compares the question vector to the vectors of the indexed data.

  3. Scores the results based on their similarity.

  4. Passes the most relevant results to the LLM as context.

A diagram showing a user question being processed by an embedding model to create a vector representation of the question. The question vector is then compared to the vectors of the indexed data. The most relevant results are used as context for the LLM.
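The steps above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not a production retriever: the `embed` function is a hypothetical stand-in that counts words from a tiny fixed vocabulary, where a real system would call an embedding model, and the "index" is a plain in-memory list rather than a vector index.

```python
import math
from collections import Counter

# Hypothetical stand-in for an embedding model: word counts over a tiny,
# fixed vocabulary. A real system would call an embedding model instead.
VOCABULARY = ["apple", "fruit", "pie", "phone", "laptop", "company"]

def embed(text):
    counts = Counter(text.lower().split())
    return [float(counts[word]) for word in VOCABULARY]

def cosine_similarity(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0  # avoid dividing by zero for empty vectors

documents = [
    "apple pie is a fruit dessert",
    "the company sells a phone and a laptop",
    "fruit salad with apple and pear",
]

# Indexing: embed each document once and keep the vector next to its text.
index = [(doc, embed(doc)) for doc in documents]

def retrieve(question, top_k=2):
    question_vector = embed(question)                     # 1. embed the question
    scored = [
        (cosine_similarity(question_vector, vector), doc)  # 2. compare, 3. score
        for doc, vector in index
    ]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [doc for _score, doc in scored[:top_k]]        # 4. keep the best matches

context = retrieve("which fruit goes in an apple pie")
prompt = "Answer using this context:\n" + "\n".join(context)
print(prompt)
```

The final `prompt` string shows where the retrieved text would be handed to the LLM as context. The prompt wording itself is an assumption for the example.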

Learn more

You can learn more about vectors, embeddings, and semantic search in the GraphAcademy course Introduction to Vector Indexes and Unstructured Data.

Check Your Understanding

Which of the following best describes an embedding in the context of machine learning?

Hint

Embeddings are numerical representations that capture the meaning or context of data, such as words or sentences.

Solution

The correct answer is "A series of numbers (vector) that represent the data."

Lesson Summary

In this lesson, you learned about vectors and embeddings, and how they can be used in RAG to find relevant information.

In the next lesson, you will use a vector index in Neo4j to find relevant data.