Therapeutic Antibodies – Hidde L. Ploegh, Martin D. Witte, Nicholas C. Yoder, Whitehead Institute for Biomedical Research

Abstract for “Using sortases for click chemistry handles to protein ligation”

“Methods and reagents for installing click chemistry handles on target proteins are provided, as are modified proteins comprising click chemistry handles. Also disclosed herein are methods and procedures for creating and using chimeric proteins, such as bispecific antibodies, that comprise two proteins conjugated via click chemistry.

Background for “Using sortases for click chemistry handles to protein ligation”

“Protein engineering has become a widely used tool in many areas. One engineering technique is controlled protein ligation. Native chemical ligation relies on efficient preparation of synthetic peptide esters, which can be difficult to prepare for many proteins. Recombinant technology can be used to create protein-protein fusions by joining the C-terminus of one protein to the N-terminus of another. Intein-based protein ligation methods can also be used to join proteins. This intein-mediated method of ligation requires that the target protein be expressed as a properly folded fusion with the intein. The limitations of both recombinant and native ligation technologies severely restrict the application of protein ligation.

“The transpeptidation reaction catalyzed by sortases has emerged as a general method for derivatizing proteins with different types of modifications. For conventional sortase modifications, target proteins are engineered to contain a sortase recognition motif (LPXT (SEQ ID NO: 144)) near their C-termini. When these artificial sortase substrates are incubated with synthetic peptides that contain one or more N-terminal glycine residues and a recombinant sortase, they undergo a transacylation reaction in which the residues C-terminal to the threonine are exchanged for the synthetic oligoglycine peptide. As a result, the protein C-terminus is ligated to the N-terminus of the synthetic peptide.
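The exchange described above can be sketched at the sequence level. The following is a minimal illustration, not a description of the actual method: it assumes the full LPXTG form of the motif, cleavage after the threonine, and one-letter amino acid strings; the function name and the example sequences are hypothetical.

```python
import re

# Assumption: the full LPXTG form of the recognition motif is used, and the
# sortase cleaves between the threonine and the glycine.
MOTIF = re.compile(r"LP.TG")


def sortase_ligate(substrate: str, nucleophile: str) -> str:
    """Return the one-letter sequence of the transpeptidation product.

    substrate   -- protein sequence containing an LPXTG motif
    nucleophile -- peptide sequence starting with at least one glycine
    """
    if not nucleophile.startswith("G"):
        raise ValueError("nucleophile needs an N-terminal glycine")
    matches = list(MOTIF.finditer(substrate))
    if not matches:
        raise ValueError("substrate has no LPXTG motif")
    last = matches[-1]  # use the motif closest to the C-terminus
    # Keep everything up to and including the threonine; the residues
    # C-terminal to it are replaced by the oligoglycine nucleophile.
    acyl_donor = substrate[: last.end() - 1]
    return acyl_donor + nucleophile


product = sortase_ligate("MKVLPETGHHHHHH", "GGGK")
print(product)
```

In this sketch the C-terminal `GHHHHHH` of the hypothetical substrate is lost and the `GGGK` peptide takes its place.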

“Some aspects of this invention relate to sortase-mediated modification of proteins, in particular to the installation of reactive chemical groups, such as click chemistry handles, on protein sequences. Methods and reagents for installing reactive chemical groups on proteins are provided, as are modified proteins, for example, proteins comprising a C-terminal or an N-terminal click chemistry handle. Methods for conjugating two proteins modified according to this invention are also provided. These methods can be used to dimerize monomeric proteins and to create chimeric proteins that combine the properties of heterologous single proteins (e.g., chimeric, bispecific antibodies).

“Some aspects provide compositions and reagents for the C-terminal or N-terminal addition of click chemistry handles to proteins via a sortase transacylation reaction. Some aspects of the invention provide methods for installing a click chemistry handle at or proximal to the C-terminus of a protein having a sortase recognition motif (e.g., LPXT (SEQ ID NO: 144)) near its C-terminus. Some aspects of the invention provide methods for installing a click chemistry handle at the N-terminus of a protein that contains one or more N-terminal glycine residues.

“Some embodiments, for example, provide a method of conjugating a target protein to a C-terminal click chemistry handle. In some embodiments, the method includes providing the target protein with a C-terminal sortase recognition motif (e.g., LPXT (SEQ ID NO: 144)), for example, as a C-terminal fusion. The method may also include contacting the target protein with an agent, such as a peptide, a protein, or a compound, that comprises 1-10 N-terminal glycine residues or an alkylamine group, together with the click chemistry handle. In some embodiments, the contacting is carried out in the presence of a sortase enzyme under conditions suitable for the sortase to transamidate the target protein and the agent comprising the click chemistry handle, thus conjugating the target protein to the click chemistry handle.

“Some embodiments provide a method of conjugating a target protein to an N-terminal click chemistry handle. The method may include providing the target protein with 1-10 N-terminal glycine residues or an N-terminal alkylamine group, for example, as an N-terminal fusion. The method may also include contacting the target protein with a peptide that comprises a sortase recognition motif (e.g., LPXT (SEQ ID NO: 144)) and the click chemistry handle. In some embodiments, the contacting is carried out in the presence of a sortase enzyme under conditions suitable for the sortase to transamidate the target protein and the peptide, thus conjugating the target protein to the click chemistry handle.

“Any chemical moiety can be installed on a protein using the methods described herein. Click chemistry handles are of particular use according to some aspects of this invention. Click chemistry handles are chemical moieties that provide a reactive group that can take part in a click chemistry reaction. Click chemistry reactions, and suitable chemical groups for them, are well known to those skilled in the art. These groups include terminal alkynes, azides, strained alkynes, dienes, alkoxyamines, and carbonyls. For example, in some embodiments, an azide and an alkyne are used in a click chemistry reaction.
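As an illustration of how handles pair up, the sketch below records a few well-known click reaction partners in a lookup table. The table is an assumption made for this example and is not taken from the disclosure's Table 1 or Table 2; it also uses handle names (dienophile, hydrazide, thiol, alkene) that appear elsewhere in this document.

```python
# Illustrative pairing table (an assumption for this sketch, not the
# disclosure's own table). Each entry is an unordered pair of handles.
PARTNERS = {
    frozenset({"azide", "terminal alkyne"}),  # copper-catalyzed azide-alkyne cycloaddition
    frozenset({"azide", "strained alkyne"}),  # strain-promoted, copper-free cycloaddition
    frozenset({"diene", "dienophile"}),       # Diels-Alder reaction
    frozenset({"alkoxyamine", "carbonyl"}),   # oxime formation
    frozenset({"hydrazide", "carbonyl"}),     # hydrazone formation
    frozenset({"thiol", "alkene"}),           # thiol-ene reaction
}


def can_react(first_handle: str, second_handle: str) -> bool:
    """Return True if the two handles are listed as click partners."""
    return frozenset({first_handle, second_handle}) in PARTNERS


print(can_react("azide", "strained alkyne"))
print(can_react("azide", "azide"))
```

Storing each pair as a `frozenset` makes the lookup independent of which protein carries which handle.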

“Some aspects provide modified proteins, for example, proteins comprising a C-terminal or an N-terminal click chemistry handle. Such proteins can be conjugated to other molecules, such as proteins, nucleic acids, polymers, lipids, or small molecules. In some embodiments, the modified protein comprises an antigen-binding domain, for example, an antigen-binding domain of an antibody, such as a camelid antibody, a single-domain antibody, a VHH domain, a nanobody, an scFv, or an antigen-binding fragment thereof.

“Some aspects provide methods for the conjugation, or ligation, of two protein molecules via click chemistry. In some embodiments, a first click chemistry handle is installed on the first protein and a second click chemistry handle is installed on the second protein, wherein the first click chemistry handle can form a covalent bond with the second click chemistry handle. Some embodiments provide a method for post-translationally conjugating two proteins to form a chimeric protein. The method may include contacting a first protein conjugated to a first click chemistry handle with a second protein conjugated to a second click chemistry handle under conditions suitable for the first click chemistry handle to react with the second click chemistry handle, thus generating a chimeric protein comprising the two proteins linked via a covalent bond.

“The methods described herein enable N-terminus-to-N-terminus and C-terminus-to-C-terminus conjugation of proteins, which cannot be achieved by recombinant methods (e.g., expression of protein fusions). For example, in some embodiments, the first click chemistry handle is conjugated to the N-terminus of the first protein, and the second click chemistry handle is conjugated to the N-terminus of the second protein, and the chimeric protein is an N-terminus-to-N-terminus conjugation of the two proteins. In other embodiments, the first click chemistry handle is conjugated to the C-terminus of the first protein and the second click chemistry handle is conjugated to the C-terminus of the second protein, and the chimeric protein is a C-terminus-to-C-terminus conjugation of the two proteins. In some embodiments, click chemistry handles are used to link the C-terminus of a first polypeptide to the N-terminus of a second polypeptide, as an alternative to producing a fusion protein recombinantly. This is especially useful, e.g., if the fusion protein is very large, toxic, difficult to purify, or encoded by nucleic acid sequences that are difficult to clone, or if cloning is to be avoided.
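The three possible orientations can be enumerated with a small helper. This is a sketch under stated assumptions: the function name and the "N"/"C" string labels are hypothetical, and "genetically accessible" here simply encodes the statement above that only head-to-tail (C-to-N) linkages are available by recombinant fusion.

```python
def conjugate_topology(first_terminus: str, second_terminus: str):
    """Classify a click conjugate by the termini that carry the handles.

    Each argument is "N" or "C". Returns a (topology, genetically_accessible)
    tuple, where the second element is True only for the head-to-tail
    arrangement that a recombinant fusion could also produce.
    """
    allowed = {"N", "C"}
    if first_terminus not in allowed or second_terminus not in allowed:
        raise ValueError("termini must be 'N' or 'C'")
    pair = "".join(sorted((first_terminus, second_terminus)))  # "CC", "CN" or "NN"
    topology = {"CC": "C-to-C", "CN": "C-to-N", "NN": "N-to-N"}[pair]
    return topology, pair == "CN"


for termini in (("N", "N"), ("C", "C"), ("C", "N")):
    print(termini, conjugate_topology(*termini))
```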

“Some embodiments provide chimeric proteins, for example, chimeric proteins generated by post-translational conjugation of two proteins according to aspects of this invention. Some embodiments provide chimeric, bispecific antibodies comprising two antigen-binding proteins. A chimeric, bispecific antibody may include a first antibody or antigen-binding antibody fragment comprising a sortase recognition sequence and a second antibody or antigen-binding antibody fragment comprising a sortase recognition sequence, with the first and the second antibody or antibody fragment conjugated together via click chemistry.

“It should be noted that the invention is not limited to the conjugation of antigen-binding proteins. Any protein can be conjugated to any molecule that comprises a suitable click chemistry handle, or on which such a handle can be installed using the methods described herein or other methods well known to those skilled in the art. Some embodiments provide chimeric proteins comprising a target protein with a sortase recognition motif (e.g., LPXT (SEQ ID NO: 144)) and a second molecule conjugated to the protein via click chemistry. In some embodiments, the chimeric protein is generated by installing a click chemistry handle on the target protein and then contacting the target protein comprising the click chemistry handle with the second molecule, where the second molecule comprises a second click chemistry handle that can react with the click chemistry handle of the target protein to form a covalent bond.

“Some embodiments provide modified proteins, for example, proteins comprising a sortase recognition motif (e.g., LPXT (SEQ ID NO: 144)) and a click chemistry handle conjugated to the sortase recognition motif, for example, directly to one of the amino acids of the sortase recognition motif or via a linker. In some embodiments, the modified protein comprises an antigen-binding domain, e.g., an antibody or an antigen-binding antibody fragment. Examples of modified proteins provided herein include proteins comprising an antigen-binding domain of a camelid antibody or a fragment thereof, a VHH domain, a single-domain antibody, an affibody, or an anticalin. In some embodiments, the click chemistry handle is positioned at the C-terminus of the protein; in other embodiments, it is positioned at the N-terminus. In some embodiments, the click chemistry handle is selected from terminal alkyne, azide, strained alkyne, dienophile, carbonyl, alkoxyamine, hydrazide, thiol, and alkene.

“Some embodiments provide kits comprising one or more reagents useful in carrying out the methods described herein. For example, the kit may include a first peptide comprising 1-10 glycine residues or a terminal alkylamine conjugated to a first click chemistry handle, and a second peptide comprising 1-10 glycine residues or a terminal alkylamine conjugated to a second click chemistry handle, wherein the click chemistry handles of the first and the second peptide can react with each other. Alternatively, the kit may include a first peptide comprising a sortase recognition motif conjugated to a first click chemistry handle and a second peptide comprising a sortase recognition motif conjugated to a second click chemistry handle, wherein the click chemistry handles of the first and the second peptide can react with each other. The kit may also include a sortase enzyme. In some embodiments, the kit further includes instructions for use, a catalyst (e.g., a metal catalyst), and/or a reaction buffer.

The above summary is meant to provide an overview of some aspects of the invention and should not be construed to limit the invention. Further aspects, advantages, and embodiments of the invention are described in this disclosure, and additional embodiments will be apparent to those skilled in the art based on the present disclosure. All references cited herein are hereby incorporated by reference.

Standard genetic approaches permit the production of protein combinations by fusing polypeptides in a head-to-tail manner. However, some applications would benefit from constructs that are genetically impossible, such as the site-specific linking of proteins via their N- or C-termini when a remaining free terminus is required for biological activity.

“Production and purification of fusion proteins remains a biotechnological challenge. To yield an active product, both domains must adopt their native fold without modification of residues or regions that are important for function. Genetic fusion, which joins the open reading frames, or fragments thereof, of two proteins, is the most common method for creating fusion proteins. Such fusions often yield defectively folded or only partially folded products.

This problem can be circumvented by post-translational conjugation of natively folded, purified proteins, for example, by means of a ligation tag. Such methods label either the N- or the C-terminus of suitably modified proteins to create the adducts of interest, much as if one were making genetic fusions. The sortase-catalyzed transacylation reaction allows such site-specific labeling, as well as the preparation of head-to-tail protein-protein fusions under native conditions (Popp M W, Ploegh H L (2011) Making and breaking peptide bonds: Protein engineering using sortase. Angew Chem Int Ed 50:5024-5032; Guimaraes C P et al. (2011) Identification of host cell factors required for intoxication through use of modified cholera toxin. J Cell Biol 195:751-764; Popp M W, Antos J M, Grotenbreg G M, Spooner E, Ploegh H L (2007) Sortagging: A versatile method for protein labeling. Nat Chem Biol 3:707-708; the entire contents of each of which are incorporated herein by reference).

Standard sortase-ligation methods do not permit the production of protein-protein fusions that are genetically impossible (N-terminus to N-terminus; C-terminus to C-terminus), although such unnatural linkages would be very useful for the creation of bispecific antibodies and their fragments. Chemical ligation is required to achieve such fusions. Early chemical conjugation strategies used non-specific crosslinking via amines or sulfhydryls (Kim J S, Raines R T (1995) Dibromobimane as a fluorescent crosslinking reagent. Anal Biochem 225:174-176; the entire contents of which are incorporated herein by reference). This approach is of limited use because it lacks control over the site and stoichiometry of modification. Bioorthogonal chemistries, site-specific mutagenesis, native chemical ligation, and amber suppressor pyrrolysine tRNA technology have enabled the creation of non-natural protein fusions, for example, for the production of bivalent and multivalent antibodies (Schellinger J G et al. (2012) A general chemical synthesis platform for crosslinking multivalent single chain variable fragments. Org Biomol Chem 10:1521-1526; Natarajan A et al. (2007) Construction of di-scFv through a trivalent alkyne-azide 1,3-dipolar cycloaddition. Chem Commun 695-697; Xiao J, Hamilton B S, Tolbert T J (2010) Synthesis of N-Terminally Linked Protein and Peptide Dimers by Native Chemical Ligation. Bioconjug Chem 21:1943-1947; the entire contents of each of which are incorporated herein by reference). The synthesis of structural analogs of ubiquitin dimers was achieved by a combination of intein-based ligation and site-specific mutagenesis (Chem Commun 48:296; Weikart N D, Mootz H D (2010) Generation of Site-Specific and Enzymatically Stable Conjugates of Recombinant Proteins with Ubiquitin-Like Modifiers by the Cu(I)-Catalyzed Azide-Alkyne Cycloaddition. ChemBioChem 11:774-777; the entire contents of each of which are incorporated herein by reference). Site-specific incorporation of propargyloxyphenylalanine facilitated the synthesis of GFP dimers (Schellinger J G et al. (2012) A general chemical synthesis platform for crosslinking multivalent single chain variable fragments. Org Biomol Chem 10:1521-1526; and Bundy B C, Swartz J R (2010) Site-Specific Incorporation of p-Propargyloxyphenylalanine in a Cell-Free Environment for Direct Protein-Protein Click Conjugation. Bioconjug Chem 21:255-263; the entire contents of each of which are incorporated herein by reference).

“Nevertheless, the synthesis of bispecifics would benefit from a method that is orthogonal to the published methods, allows easy access to modified native proteins, and permits efficient non-natural conjugation. The availability of several orthogonal methods would also allow the creation of more complex protein structures (e.g., heterotrimers or higher-order complexes). Disclosed herein are reagents and methods for conjugating proteins at their N- or C-terminus to other entities, including, but not limited to, other proteins. Some of these methods install click chemistry handles on proteins using a sortase-catalyzed transpeptidation reaction. The modified proteins can then be conjugated to a molecule that also comprises a reactive click chemistry handle.

“Some aspects of this invention relate to the recognition that the sortase transacylation reaction allows for the facile installation of all kinds of substituents at the C-terminus of a suitably modified protein. A successful transacylation reaction requires only a suitable sortase recognition motif, e.g., an LPXT (SEQ ID NO: 144) or LPXTG (SEQ ID NO: 2) motif, in the target protein. Nucleophiles for use in a sortase-catalyzed reaction are equally easy to design: a short run (e.g., 1-10) of glycine residues, or even an alkylamine, suffices to allow the reaction to proceed. The sortase transacylation strategy is thus a simple way to modify a target protein, with easy synthesis of the required reagents and execution of the reaction under physiological conditions.

Some aspects of the invention relate to the recognition that nucleophiles used in sortase reactions can carry any number of modifications, such as biotin, detectable labels (e.g., fluorophores), fatty acids, nucleic acids, lipids, carbohydrates, and radioisotopes, and can even be proteins with a suitably exposed N-terminal stretch of glycine residues. Some aspects of the invention also provide nucleophiles for use in sortase reactions that comprise reactive chemical moieties, in particular moieties, or “handles”, that are suitable for click chemistry, such as a copper-free click reaction. Such nucleophiles, e.g., peptides comprising 1-10 glycine residues (e.g., GGG) or any compound (e.g., a peptide) that comprises an alkylamine group, together with a click chemistry handle, can be used to install a C-terminal click chemistry handle on a target protein that comprises a C-terminal sortase recognition motif. The sortase recognition motif does not need to be at the very C-terminus, but it must be sufficiently accessible to the enzyme for the sortase reaction to take place.

Click chemistry handles can likewise be installed N-terminally on proteins comprising a short glycine run, or on proteins or compounds comprising an alkylamine group (e.g., at the N-terminus, for proteins), by a sortase reaction using a peptide that comprises a sortase recognition motif and the desired click chemistry handle. According to the invention, any protein that comprises a sortase recognition motif, 1-10 glycine residues, or a terminal alkylamine group can be derivatized with a click chemistry handle. Installing a click chemistry handle on a target protein confers click chemistry reactivity on the protein. For example, a protein comprising a click chemistry handle as described herein can react with a second molecule that comprises a second click chemistry handle to form a covalent bond, thus conjugating the two molecules together.
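The glycine-run criterion for a nucleophile can be checked with a few lines of code. This is a minimal sketch, assuming one-letter sequences; the function names are hypothetical, and treating the 1-10 range as a hard upper limit is a simplification made for this example rather than a statement about sortase chemistry.

```python
def n_terminal_glycines(sequence: str) -> int:
    """Count the uninterrupted run of glycines at the N-terminus."""
    return len(sequence) - len(sequence.lstrip("G"))


def is_sortase_nucleophile(sequence: str, min_gly: int = 1, max_gly: int = 10) -> bool:
    """Return True if the sequence starts with min_gly to max_gly glycines.

    The default bounds mirror the 1-10 glycine range given in the text.
    """
    return min_gly <= n_terminal_glycines(sequence) <= max_gly


for peptide in ("GGGK", "AGGG", "GKGG"):
    print(peptide, n_terminal_glycines(peptide), is_sortase_nucleophile(peptide))
```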

“In some embodiments, proteins carrying reactive click chemistry handles are conjugated to one another by carrying out the appropriate click chemistry reaction, which links the proteins via a covalent bond. Because the inventive strategies allow click chemistry handles to be installed at either the N- or the C-terminus of a protein, two proteins so modified can be conjugated via a covalent bond from the C-terminus of the first protein to the N-terminus of the second, much like a conventional protein fusion. In addition, installing C-terminal reactive click chemistry handles on both target proteins allows the generation of proteins conjugated via a covalent bond at their C-termini (C-to-C-termini, C-C), while installing N-terminal reactive click chemistry handles on both target proteins allows the generation of proteins conjugated at their N-termini (N-to-N-termini, N-N). Neither covalent C-C conjugation nor covalent N-N conjugation can be achieved by conventional protein engineering technologies, such as recombinant protein fusion technology.

“Sortase Mediated Installation of Click Chemistry Handles”

“Sortases and sortase-mediated transacylation reactions are well known to those skilled in the art. In general, the transpeptidation reaction catalyzed by sortase results in the ligation of species containing a transamidase recognition motif with species bearing one or more N-terminal glycine residues. In certain embodiments, the sortase recognition motif is a sortase recognition motif described herein, for example, an LPXT (SEQ ID NO: 144) or an LPXTG (SEQ ID NO: 2) motif. Substituting the C-terminal residue of the recognition sequence with a moiety that exhibits poor nucleophilicity once released from the sortase provides for more efficient ligation.

“The sortase transacylation reaction allows for the efficient linking of an acyl donor with a nucleophilic acyl acceptor. This principle is widely applicable to many acyl donors and a multitude of different acyl acceptors. The sortase reaction has previously been used to ligate proteins and/or peptides to one another, to link a reporter molecule or label to a protein or peptide, to join a nucleic acid to a protein or peptide, and to conjugate a protein or peptide to a solid support or polymer. Such products and processes save cost and time in the synthesis of ligation products and make it convenient to link an acyl donor to an acyl acceptor.

Sortase-mediated transacylation reactions are catalyzed by the transamidase activity of sortase. A transamidase is an enzyme that can form a peptide linkage (i.e., an amide linkage) between an acyl donor compound and a nucleophilic acyl acceptor containing an NH2-CH2- moiety. In some embodiments, the sortase is sortase A (SrtA). It should be noted, however, that the invention is not limited to the use of sortase A. Sortases have been isolated from Gram-positive bacteria, which contain peptidoglycan, polysaccharides, and/or teichoic acids as part of their cell wall structure. Gram-positive bacteria include the following genera: Actinomyces, Bacillus, Bifidobacterium, Cellulomonas, Clostridium, Corynebacterium, Micrococcus, Mycobacterium, Nocardia, Staphylococcus, Streptococcus, and Streptomyces.

“Sortase Mediated Installation of C-Terminal Click Chemistry Handles”

“In some embodiments, a sortase-mediated transacylation reaction for installing a C-terminal click chemistry handle on a protein comprises a step of contacting a protein comprising a transamidase recognition sequence of the structure:

[Structural formulas and the accompanying “wherein” definitions are not reproduced in this text.]

“Those skilled in the art will understand that the click chemistry handle can be incorporated into B1 in any manner and at any position that is feasible. For example, B1 may comprise an amino acid (e.g., lysine), and the click chemistry handle may be attached to the central carbon, the side chain, or the carboxyl group of the amino acid. Other ways of incorporating the click chemistry handle into B1 will be apparent to those skilled in the art.

It will be further understood that, depending on the nature of B1, the click chemistry handle may be installed at the very C-terminus of the target protein or near the C-terminus. For example, if B1 comprises a first amino acid carrying the click chemistry handle followed by several additional amino acids, the handle will be installed close to, but not directly at, the C-terminus of the modified protein. It will be apparent to those skilled in the art that a similar situation exists for N-terminal installation.

“A person of ordinary skill will appreciate that, in certain embodiments, the C-terminal amino acid of the transamidase recognition sequence is omitted; that is, an acyl group replaces the C-terminal amino acid of the transamidase recognition sequence. [The structures of the acyl groups recited for these embodiments are not reproduced in this text.]

“In some embodiments, the sortase, or transamidase, recognition sequence is LPXT (SEQ ID NO: 144), in which X is any standard or non-standard amino acid. In some embodiments, X is selected from D, E, A, Q, K, or R. In some embodiments, X is chosen to match a naturally occurring transamidase recognition sequence. In some embodiments, the transamidase recognition sequence is selected from LPKT, LPIT (SEQ ID NO: 49), and SPKT. In some embodiments (e.g., those in which sortase B is used), the transamidase recognition sequence comprises X1PX2X3 (SEQ ID NO: 152), in which X1 is leucine or isoleucine; X2 is any amino acid; X3 is threonine or serine; P is proline; and G is glycine. In specific embodiments, as noted above, X1 is leucine and X3 is threonine. In some embodiments, X2 is aspartate, glutamate, alanine, glutamine, or methionine. In certain embodiments, such as those that use sortase B, the recognition sequence often comprises the amino acid sequence NPX1TX2 (SEQ ID NO: 153), in which X1 is glutamine or lysine; X2 is asparagine or glycine; N is asparagine; P is proline; and T is threonine. This invention recognizes that the selection of X may be based, at least in part, on desired properties of the compound containing the recognition motif. In some embodiments, X is selected to alter a property of the compound containing the recognition motif, for example, to increase or decrease its solubility in a particular solvent. In some embodiments, X is selected to be compatible with the reaction conditions used to synthesize the compound containing the recognition motif, for example, to be unreactive toward the reactants used in the synthesis.
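The motif definitions above translate naturally into regular expressions. The sketch below is illustrative only: the pattern names and the `find_motifs` helper are hypothetical, sequences are assumed to be one-letter amino acid strings, and only the LPXT and X1PX2X3 definitions given in this paragraph (X1 = L or I, X3 = T or S) are encoded.

```python
import re

# Regular-expression forms of two motifs as defined in the text.
# "." stands for "any amino acid" (the X / X2 position).
MOTIFS = {
    "LPXT": re.compile(r"LP.T"),
    "X1PX2X3": re.compile(r"[LI]P.[TS]"),
}


def find_motifs(sequence: str):
    """Return (motif_name, start_index, matched_text) for every hit.

    Hits are sorted by position, then by motif name. Matches of a single
    pattern do not overlap, which is enough for this illustration.
    """
    hits = []
    for name, pattern in MOTIFS.items():
        for match in pattern.finditer(sequence):
            hits.append((name, match.start(), match.group()))
    return sorted(hits, key=lambda hit: (hit[1], hit[0]))


for hit in find_motifs("MAAALPETGGSIPKSA"):
    print(hit)
```

Because LPXT is a special case of X1PX2X3 as defined here, a sequence such as LPET is reported under both names.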

“In some embodiments, X is -O-. In some embodiments, X is -NR-. In some embodiments, X is -NH-. In some embodiments, X is -S-.”

“In some embodiments, R1 is substituted aliphatic; in others, R1 is unsubstituted aliphatic. In some embodiments, R1 is substituted or unsubstituted C1-12 aliphatic. In some embodiments, R1 is substituted or unsubstituted C1-6 aliphatic. In some embodiments, R1 is C1-3 aliphatic. In some embodiments, R1 is butyl, for example, n-butyl or isobutyl. In some embodiments, R1 is propyl, for example, n-propyl or isopropyl. In some embodiments, R1 is ethyl. In some embodiments, R1 is methyl.

“In some embodiments, R1 is substituted aryl. In some embodiments, R1 is unsubstituted aryl. In some embodiments, R1 is substituted phenyl. In some embodiments, R1 is unsubstituted phenyl.

“In some embodiments, A1 comprises a protein. In some embodiments, A1 comprises a peptide. In some embodiments, A1 comprises an antibody, an antibody fragment, an antibody epitope, an affibody, an anticalin, or a DARPin. In some embodiments, A1 comprises a recombinant protein or a protein that contains one or more D-amino acids. In some embodiments, A1 comprises an amino acid sequence of at least three amino acids. In some embodiments, A1 comprises green fluorescent protein. In some embodiments, A1 comprises ubiquitin.

“In some embodiments, B1 comprises a click chemistry handle, for example, a click chemistry handle described herein. In some embodiments, B1 comprises a click chemistry handle described in Table 1, Table 2, or FIG. 2B. In certain embodiments, B1 comprises a terminal alkyne, an azide, a strained alkyne, a dienophile, or an alkoxyamine.”

“In some embodiments, n is an integer from 0 to 50, inclusive. In some embodiments, n is an integer from 0 to 20, inclusive. In some embodiments, n is 0, 1, 2, 4, or 5.

“Sortase Mediated Installation of N-Terminal Click Chemistry Handles”

“In some embodiments, a sortase-mediated transacylation reaction for installing an N-terminal click chemistry handle on a protein comprises a step of contacting a protein of the structure:

[Structural formulas and the accompanying “wherein” definitions are not reproduced in this text.]

“Those skilled in the art will understand that the click chemistry handle can be incorporated into A1 in any manner and at any position that is feasible. For example, A1 may comprise an amino acid (e.g., lysine), and the click chemistry handle may be attached to the central carbon, the side chain, or the amino group of the amino acid. Other ways of incorporating the click chemistry handle into A1 will be apparent to those skilled in the art.

“A person of ordinary skill will appreciate that, in certain embodiments, the C-terminal amino acid of the transamidase recognition sequence is omitted; that is, an acyl group replaces the C-terminal amino acid of the transamidase recognition sequence. [The structures of the acyl groups recited for these embodiments are not reproduced in this text.]

“In some embodiments, the sortase, or transamidase, recognition sequence is LPXT (SEQ ID NO: 144), in which X is any standard or non-standard amino acid. In some embodiments, X is selected from D, E, A, Q, K, or R. In some embodiments, X is chosen to match a naturally occurring transamidase recognition sequence. In some embodiments, the transamidase recognition sequence is selected from LPKT (SEQ ID NO: 48), LPIT (SEQ ID NO: 49), LPDT (SEQ ID NO: 50), LPDT (SEQ ID NO: 52), LAAT (SEQ ID NO: 53), LAAT (SEQ ID NO: 54), LAET (SEQ ID NO: 55), LAET (SEQ ID NO: 56), LPLT (SEQ ID NO: 65), LPQT (SEQ ID NO: 69), NAKT (SEQ ID NO: 69), and NPQS (SEQ ID NO: 70). In some embodiments (e.g., those in which sortase B is used), the transamidase recognition sequence comprises X1PX2X3 (SEQ ID NO: 152), in which X1 is leucine or isoleucine; X2 is any amino acid; X3 is threonine or serine; P is proline; and G is glycine. In specific embodiments, as noted above, X1 is leucine and X3 is threonine. In some embodiments, X2 is aspartate, glutamate, alanine, or glutamine. In certain embodiments, such as those that use sortase B, the recognition sequence often comprises the amino acid sequence NPX1TX2 (SEQ ID NO: 153), in which X1 is glutamine or lysine; X2 is asparagine or glycine; N is asparagine; P is proline; and T is threonine. This invention recognizes that the selection of X may be based, at least in part, on desired properties of the compound containing the recognition motif. In some embodiments, X is selected to alter a property of the compound containing the recognition motif, for example, to increase or decrease its solubility in a particular solvent. In some embodiments, X is selected to be compatible with the reaction conditions used to synthesize the compound containing the recognition motif, for example, to be unreactive toward the reactants used in the synthesis.

“In some embodiments, X is -O-. In some embodiments, X is -NR-. In some embodiments, X is -NH-. In some embodiments, X is -S-.”

“In some embodiments, R1 is substituted aliphatic; in others, R1 is unsubstituted aliphatic. In some embodiments, R1 is substituted or unsubstituted C1-12 aliphatic. In some embodiments, R1 is substituted or unsubstituted C1-6 aliphatic. In some embodiments, R1 is C1-3 aliphatic. In some embodiments, R1 is butyl, for example, n-butyl or isobutyl. In some embodiments, R1 is propyl, for example, n-propyl or isopropyl. In some embodiments, R1 is ethyl. In some embodiments, R1 is methyl.

“In some embodiments, R1 is substituted aryl. In some embodiments, R1 is unsubstituted aryl. In some embodiments, R1 is substituted phenyl. In some embodiments, R1 is unsubstituted phenyl.

“In some embodiments, B1 comprises a protein. In some embodiments, B1 comprises a peptide. In some embodiments, B1 comprises an antibody, an antibody chain, an antibody fragment, an antibody epitope, an antigen-binding antibody domain, or an enzyme. In some embodiments, B1 comprises a recombinant protein or a protein that contains one or more D-amino acids. In some embodiments, B1 comprises an amino acid sequence of at least three amino acids. In some embodiments, B1 comprises green fluorescent protein. In some embodiments, B1 comprises ubiquitin.

“In some embodiments, A1 comprises a click chemistry handle. In some embodiments, A1 comprises a click chemistry handle described herein, for example a click chemistry handle described in Table 1, Table 2, or FIG. 2B. In certain embodiments, A1 comprises a terminal alkyne, an azide, a strained alkyne, a diene, a dienophile, or an alkoxyamine.”

“In some embodiments, n is an integer between 0 and 50, inclusive. In some embodiments, n is an integer between 0 and 20, inclusive. In some embodiments, n is 0. In some embodiments, n is 1. In some embodiments, n is 2. In some embodiments, n is 4. In some embodiments, n is 5.

“Suitable Enzymes and Recognition Motifs”

“In certain embodiments, the transamidase is a sortase. Enzymes identified as “sortases” in Gram-positive bacteria cleave proteins and translocate them to intact cell walls. Sortase A (SrtA) and sortase B (SrtB) are among the sortases that have been isolated from Staphylococcus aureus. In certain embodiments, the transamidase used according to the present invention is a sortase A, e.g., from S. aureus. In certain embodiments, the transamidase is a sortase B, e.g., from S. aureus.”

Sortases have been classified into four classes, designated A, B, C, and D, based on sequence alignment and phylogenetic analysis of 61 sortases from Gram-positive bacterial genomes (Dramsi S, Trieu-Cuot P, Bierne H, Sorting sortases: a nomenclature proposal for the various sortases of Gram-positive bacteria. Res Microbiol. 156(3):289-97, 2005). These classes correspond to the following subfamilies, into which sortases have also been classified by Comfort and Clubb (Comfort D, Clubb R T. A comparative genome analysis identifies distinct sorting pathways in gram-positive bacteria. Infect Immun. 72(5):2710-22, 2004): Class A (Subfamily 1), Class B (Subfamily 2), Class C (Subfamily 3), and Class D (Subfamilies 4 and 5). These references disclose numerous sortases and recognition motifs. See also Pallen M. J., Lam A. C., Antonio M., and Dunbar K., TRENDS in Microbiology 2001, 9(3): 97-101. Those skilled in the art will readily be able to assign a sortase to the correct class based on its sequence and/or other characteristics, such as those described by Dramsi et al., supra. The term “sortase A” is used herein to refer to a class A sortase, usually named SrtA in any particular bacterial species, e.g., SrtA from S. aureus. Likewise, “sortase B” is used herein to refer to a class B sortase, usually named SrtB in any particular bacterial species. The invention includes embodiments relating to a sortase A from any bacterial species or strain. The invention includes embodiments relating to a sortase B from any bacterial species or strain. The invention includes embodiments relating to a class C sortase from any bacterial species or strain. The invention includes embodiments relating to a class D sortase from any bacterial species or strain.

“The amino acid sequences of SrtA and SrtB, and the nucleotide sequences that encode them, are known to those skilled in the art and are disclosed in a number of the references cited herein, the entire contents of each of which are incorporated herein by reference. The amino acid sequences of S. aureus SrtA and SrtB are homologous, sharing, for example, 22% sequence identity and 37% sequence similarity. The amino acid sequence of a sortase transamidase from Staphylococcus aureus also has substantial homology with sequences of enzymes from other Gram-positive bacteria, and such transamidases can be used in the ligation procedures described herein. For example, SrtA shows about 31% sequence identity and about 44% sequence similarity with the best alignment over the entire sequenced region of the S. pyogenes open reading frame, and about 28% sequence identity with the best alignment over the entire sequenced region of the A. naeslundii open reading frame. It will be appreciated that different bacterial strains may exhibit differences in the sequence of a particular polypeptide, and the sequences given herein are exemplary.

“In certain embodiments, a transamidase bearing 18% or more, or 20% or more, sequence identity with the S. pyogenes, A. naeslundii, S. mutans, E. faecalis, or B. subtilis open reading frame encoding a sortase can be screened, and enzymes having transamidase activity comparable to that of Srt A from S. aureus can be used (e.g., comparable activity is sometimes 10% or more of Srt A activity).
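
The identity-based screen described above can be illustrated with a minimal sketch. It assumes the two sequences have already been aligned by an external tool and are supplied as equal-length strings with "-" marking gaps. The function names, the toy sequences, and the choice of denominator (all columns in which at least one sequence has a residue) are assumptions made for illustration; alignment programs differ in how they define percent identity.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity between two pre-aligned sequences of equal length.

    Assumption: columns where both sequences have a gap are ignored, and
    every other column counts toward the denominator.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == gap and b == gap:
            continue  # column carries no information
        compared += 1
        if a == b:
            matches += 1  # identical residues (neither can be a gap here)
    return 100.0 * matches / compared if compared else 0.0


def worth_screening(aligned_candidate, aligned_reference, threshold=18.0):
    """True if the candidate meets the identity threshold (18% by default)."""
    return percent_identity(aligned_candidate, aligned_reference) >= threshold


# Hypothetical toy alignment, not real sortase sequences.
identity = percent_identity("LPETG-A", "LPKTGGA")
```

A candidate that passes such a screen would still need to be tested experimentally for transamidase activity, as the text notes.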

“In some embodiments of this invention, the sortase is a sortase A (SrtA). SrtA recognizes the motif LPXTG (SEQ ID NO: 2), with common recognition motifs including LPKTG, LPATG, and LPNTG (SEQ ID NO: 97). In some embodiments, LPETG (SEQ ID NO: 4) is used. However, motifs that fall outside this consensus may also be recognized. For example, in some embodiments the motif comprises an ‘A’ rather than a ‘T’ at position 4, e.g., LPXAG (SEQ ID NO: 98), e.g., LPNAG (SEQ ID NO: 99). In some embodiments, the motif comprises an ‘A’ rather than a ‘G’ at position 5, e.g., LPXTA (SEQ ID NO: 100), e.g., LPNTA (SEQ ID NO: 101). In some embodiments, the motif comprises a ‘G’ rather than a ‘P’ at position 2, e.g., LGXTG (SEQ ID NO: 102), e.g., LGATG (SEQ ID NO: 103). In some embodiments, the motif comprises an ‘I’ rather than an ‘L’ at position 1, e.g., IPXTG (SEQ ID NO: 104), e.g., IPNTG (SEQ ID NO: 105) or IPETG (SEQ ID NO: 106).
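
As an illustration of the motifs just listed, the following minimal sketch scans a protein sequence for the SrtA consensus LPXTG and for the four single-position variants named in the text. It assumes sequences are given in uppercase one-letter code; the function name and the example sequence are hypothetical, and a pattern match says nothing about whether a given sortase will actually process the site.

```python
import re

# Consensus SrtA motif from the text: L-P-X-T-G, X being any residue.
SRTA_CONSENSUS = re.compile(r"LP[A-Z]TG")

# Variants outside the consensus that the text mentions.
SRTA_VARIANTS = {
    "A at position 4 (LPXAG)": re.compile(r"LP[A-Z]AG"),
    "A at position 5 (LPXTA)": re.compile(r"LP[A-Z]TA"),
    "G at position 2 (LGXTG)": re.compile(r"LG[A-Z]TG"),
    "I at position 1 (IPXTG)": re.compile(r"IP[A-Z]TG"),
}


def find_srta_motifs(sequence):
    """Return sorted (start_index, motif, label) tuples for every match."""
    hits = [(m.start(), m.group(), "consensus LPXTG")
            for m in SRTA_CONSENSUS.finditer(sequence)]
    for label, pattern in SRTA_VARIANTS.items():
        hits.extend((m.start(), m.group(), label)
                    for m in pattern.finditer(sequence))
    return sorted(hits)


# Hypothetical substrate: a short protein with LPETG followed by a His tag.
hits = find_srta_motifs("MKTAYIAKQRLPETGGHHHHHH")
```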

“It will be appreciated that the terms “recognition motif” and “recognition sequence”, when referring to sequences recognized by a transamidase, are used interchangeably. The term “transamidase recognition sequence” is sometimes abbreviated “TRS” herein.”

“In some embodiments, the sortase is a sortase B (SrtB), e.g., a sortase B of S. aureus, B. anthracis, or L. monocytogenes. Motifs recognized by sortases of the B class (SrtB) often fall within the consensus sequence NPXTX (SEQ ID NO: 154), e.g., NP[Q/K]-[T/S]-[N/G/S] (SEQ ID NO: 107), such as NPQTN (SEQ ID NO: 108) or NPKTG (SEQ ID NO: 109). For example, sortase B of S. aureus or B. anthracis cleaves the NPQTN or NPKTG (SEQ ID NO: 110) motif of IsdC in the respective bacteria (see Marraffini L. and Schneewind O., Journal of Bacteriology 189 (17), pp. 6425-6436, 2007). Other recognition motifs found in class B sortases include NSKTA (SEQ ID NO: 112), NPQTG (SEQ ID NO: 113), NAKTN (SEQ ID NO: 114), and NPQSS (SEQ ID NO: 115). SrtB also recognizes certain motifs that lack P at position 2 and/or lack Q or K at position 3 (Mariscotti J F, Garcia-Del Portillo F, Pucciarelli M G. The Listeria monocytogenes sortase-B recognizes varied amino acids at position 2 of the sorting motif. J Biol Chem. 2009 Jan. 7).

“In certain embodiments, the sortase is a class C sortase. Class C sortases can use LPXTG (SEQ ID NO: 2) as a recognition motif.

“In certain embodiments, the sortase is a class D sortase. Sortases in this class are predicted to recognize motifs with the consensus sequence NA-[E/A/S/H]-TG (SEQ ID NO: 118) (Comfort D, supra). Class D sortases have been found in, e.g., Streptomyces spp., Corynebacterium spp., Tropheryma whipplei, Thermobifida fusca, and Bifidobacterium longum. LPXTA (SEQ ID NO: 100) or LAXTG (SEQ ID NO: 120) may serve as a recognition sequence for class D sortases, e.g., of subfamilies 4 and 5, respectively; subfamily-4 and subfamily-5 enzymes process the motifs LPXTA and LAXTG, respectively. B. anthracis Sortase C, which is a class D sortase, has been shown to specifically cleave the LPNTA (SEQ ID NO: 123) motif in B. anthracis BasI (Marraffini, supra).
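
The class-specific consensus motifs described in the preceding paragraphs can be collected into a small lookup, sketched below. The regular expressions are transcriptions of the consensus strings given in the text (LPXTG for classes A and C, NP[Q/K][T/S][N/G/S] for class B, NA[E/A/S/H]TG for class D); the dictionary and function names are hypothetical. The sketch covers only these consensus patterns, so motifs the text describes as falling outside a consensus (for example LPNTA, which a class D sortase is reported to cleave) return no match.

```python
import re

# Consensus recognition motifs per sortase class, as stated in the text.
CLASS_CONSENSUS = {
    "A": r"LP.TG",
    "B": r"NP[QK][TS][NGS]",
    "C": r"LP.TG",
    "D": r"NA[EASH]TG",
}


def classes_matching(motif):
    """Return the sortase classes whose consensus pattern the motif fits."""
    return [cls for cls, pattern in CLASS_CONSENSUS.items()
            if re.fullmatch(pattern, motif)]


candidates = {m: classes_matching(m)
              for m in ("LPETG", "NPQTN", "NPKTG", "NAETG", "LPNTA")}
```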

“For a description of a sortase that recognizes the QVPTGV (SEQ ID NO: 124) motif, see Barnett and Scott (Barnett T C, Scott J R. Differential recognition of surface proteins in Streptococcus pyogenes by two sortase gene homologs. Journal of Bacteriology, Vol. 184, No. 8, p. 2181-2191, 2002).”

“The invention contemplates the use of sortases found in any gram-positive organism, such as those mentioned herein and/or in the references (including databases) cited herein. The invention also contemplates the use of sortases found in gram-negative bacteria, e.g., Colwellia psychrerythraea, Microbulbifer degradans, Bradyrhizobium japonicum, and Shewanella oneidensis. These sortases recognize sequence motifs such as LP[Q/K]T[A/S]T (SEQ ID NO: 121). In keeping with the variation tolerated at position 3 by sortases from gram-positive organisms, a sequence motif LPXT[A/S] may be used.

“The invention contemplates use of sortase recognition motifs from any of the experimentally verified or putative sortase substrates listed at bamics3.cmbi.kun.nl/jos/sortase_substrates/help.html, the contents of which are incorporated herein by reference, and/or in any of the above-mentioned references. In some embodiments, the sortase recognition motif is LPKTG. The recognition sequence may include LPXT (SEQ ID NO: 144), LAXT (SEQ ID NO: 146), LPXA (SEQ ID NO: 155), LPDTA (SEQ ID NO: 72), SPKTG (SEQ ID NO: 77), LAATG (SEQ ID NO: 76), LAHTG (SEQ ID NO: 79), LPETG (SEQ ID NO: 81), NAKT (SEQ ID NO: 94), NPQS (SEQ ID NO: 95), or LPIT. In some embodiments, “X” in any sortase recognition motif is any standard or non-standard amino acid, and each such variation is disclosed. In some embodiments, X is selected from the 20 standard amino acids most commonly found in proteins of living organisms. In some embodiments, e.g., where the recognition motif is LPXTG (SEQ ID NO: 2) or LPXT (SEQ ID NO: 144), X is D, E, A, N, Q, K, or R. In other embodiments, X is selected from the amino acids that occur naturally at position 3 in a sortase substrate. In some embodiments, X is selected from K, E, N, Q, and A in an LPXTG (SEQ ID NO: 2) or LPXT (SEQ ID NO: 144) motif where the sortase is a sortase A. In some embodiments, X is selected from K, S, E, L, A, and N in an LPXTG (SEQ ID NO: 2) or LPXT (SEQ ID NO: 144) motif and a class C sortase is used.

“In some embodiments, a recognition sequence further comprises one or more additional amino acids, e.g., at the N-terminus or C-terminus. For example, one or more amino acids (e.g., up to five amino acids) may be included that have the identity of the amino acids found immediately N-terminal or C-terminal (or both) to a 5-amino-acid recognition sequence in a naturally occurring sortase substrate. Such additional amino acids can provide context that improves recognition of the motif.

“The term “transamidase recognition sequence” may refer to a masked or an unmasked transamidase recognition sequence. An unmasked transamidase recognition sequence can be recognized by a transamidase. An unmasked transamidase recognition sequence may have been previously masked, e.g., as discussed herein. In some embodiments, a “masked transamidase recognition sequence” is a sequence that is not recognized by a transamidase but that can be readily modified (“unmasked”) so that the resulting sequence is recognized by a transamidase. In some embodiments, at least one amino acid of a masked transamidase recognition sequence has a side chain comprising a moiety that inhibits, e.g., substantially prevents, recognition of the sequence by a transamidase of interest; removing the moiety allows the transamidase to recognize the sequence. In certain embodiments, masking reduces recognition by 80%, 90%, 95%, or more. For example, in certain embodiments a threonine residue in a transamidase recognition sequence such as LPXTG (SEQ ID NO: 2) is phosphorylated, which renders the sequence refractory to recognition and cleavage by SrtA. The masked recognition sequence can be unmasked by treatment with a phosphatase, allowing it to be used in a SrtA-catalyzed transamidation reaction.
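
The masking and unmasking logic described above can be modeled with a small data structure. This is a conceptual sketch only: the class and function names are hypothetical, phosphorylation is tracked as a set of 0-based residue positions, and the sketch treats any phosphorylated residue as fully blocking recognition, whereas the text describes a reduction of 80% or more rather than an absolute block.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class RecognitionSequence:
    residues: str                            # one-letter code, e.g. "LPETG"
    phosphorylated: frozenset = frozenset()  # 0-based positions with a phosphate

    def is_masked(self):
        """Masked if any residue carries a blocking phosphate group."""
        return len(self.phosphorylated) > 0

    def phosphatase_treated(self):
        """Model phosphatase treatment: return a copy with phosphates removed."""
        return RecognitionSequence(self.residues)


def srta_recognizes(seq):
    """Simplified rule: SrtA recognizes an unmasked LPXTG sequence."""
    return (not seq.is_masked()
            and re.fullmatch(r"LP.TG", seq.residues) is not None)


# Threonine at index 3 of LPETG is phosphorylated, masking the motif.
masked = RecognitionSequence("LPETG", frozenset({3}))
unmasked = masked.phosphatase_treated()
```

In this model, `srta_recognizes(masked)` is false and `srta_recognizes(unmasked)` is true, mirroring the phosphatase-dependent activation described in the text.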

“Modified Proteins Comprising Click Chemistry Handles”

“Some embodiments provide a modified protein (PRT) comprising a C-terminal click chemistry handle (CCH), wherein the modified protein comprises a structure according to Formula (I):
PRT-LPXT-[Xaa]y-CCH (I) (SEQ ID NO: 158)

“Some embodiments provide a modified protein (PRT) comprising an N-terminal click chemistry handle (CCH), wherein the modified protein comprises a structure according to Formula (II):
CCH-[Xaa]y-LPXT-PRT (II) (SEQ ID NO: 159)

Click chemistry reactions allow two proteins that each carry a click chemistry handle to be covalently conjugated (e.g., a first protein whose click chemistry handle provides a nucleophilic (Nu) group and a second protein whose click chemistry handle provides an electrophilic (E) group that can react with the Nu group of the first handle). Click chemistry was introduced by Sharpless in 2001 and describes chemistry tailored to generate substances quickly and reliably by joining small units together (see, e.g., Kolb, Finn, and Sharpless, Angewandte Chemie International Edition (2001) 40: 2004-2021; Evans, Australian Journal of Chemistry (2007) 60: 384-395). Additional examples of click chemistry reactions, reaction conditions, and related methods that are useful according to this invention are described in Joerg Lahann, Click Chemistry for Biotechnology and Materials Science, 2009, John Wiley & Sons Ltd, ISBN 978-0-470-69970-6.

Click chemistry should be modular and broad in scope, give high chemical yields, generate inoffensive byproducts, and be physiologically stable. Several reactions fit this concept:

“(1) The Huisgen 1,3-dipolar cycloaddition (e.g., the Cu(I)-catalyzed stepwise variant, often referred to simply as the “click reaction”; see, e.g., Tornoe et al., Journal of Organic Chemistry (2002) 67: 3057-3064). Copper and ruthenium are the most commonly used catalysts for the reaction. Using copper as the catalyst gives the 1,4-regioisomer, whereas ruthenium gives the 1,5-regioisomer;

“(2) Other cycloaddition reactions, such as the Diels-Alder reaction;

“(3) Nucleophilic addition to small strained rings such as epoxides and aziridines;

“(4) Nucleophilic addition to activated carbonyl groups; and

“(5) Addition reactions to carbon-carbon double or triple bonds.”

“Conjugation of Proteins via Click Chemistry Handles”

Click chemistry can be used to conjugate two proteins. For this, the click chemistry handles of the two proteins must be reactive with each other: the reactive moiety of one click chemistry handle reacts with the reactive moiety of the other click chemistry handle to form a covalent bond. Such reactive pairs of click chemistry handles are well known to those skilled in the art and include, but are not limited to, those described in Table I.

“TABLE I: Exemplary click chemistry handles and reactions, wherein each occurrence of R1, R2 is independently PRT-LPXT-[Xaa]y- (SEQ ID NO: 158) or -[Xaa]y-LPXT-PRT (SEQ ID NO: 159), according to Formulas (I) and (II).
1,3-dipolar cycloaddition
Strain-promoted cycloaddition
Diels-Alder reaction
Thiol-ene reaction”
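
The requirement that two click chemistry handles form a reactive pair can be sketched as a lookup keyed on unordered handle pairs. The four reaction names come from Table I; the handle pairings assigned to them below (azide with terminal alkyne, azide with strained alkyne, diene with dienophile, thiol with alkene) are standard chemistry but are an assumption about the table's contents, since the structures are not reproduced in the text. The names `REACTIVE_PAIRS` and `click_reaction` are hypothetical.

```python
# Unordered handle pairs mapped to the Table I reaction they undergo.
# Pairings are assumed from general chemistry, not read from the table.
REACTIVE_PAIRS = {
    frozenset({"azide", "terminal alkyne"}): "1,3-dipolar cycloaddition",
    frozenset({"azide", "strained alkyne"}): "strain-promoted cycloaddition",
    frozenset({"diene", "dienophile"}): "Diels-Alder reaction",
    frozenset({"thiol", "alkene"}): "thiol-ene reaction",
}


def click_reaction(handle_a, handle_b):
    """Return the reaction name for a compatible pair, or None otherwise."""
    return REACTIVE_PAIRS.get(frozenset({handle_a, handle_b}))


# Order of the two handles does not matter.
reaction = click_reaction("strained alkyne", "azide")
# Two identical handles are not a reactive pair in this table.
no_reaction = click_reaction("azide", "azide")
```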

“In some preferred embodiments, click chemistry handles are used that can react to form a covalent bond in the absence of a metal catalyst. Such click chemistry handles are well known to those skilled in the art and include the click chemistry handles described in Becer, Hoogenboom, and Schubert, Click Chemistry beyond Metal-Catalyzed Cycloaddition, Angewandte Chemie International Edition (2009) 48: 4900-4908.

“Another click chemistry handle that can be used in the methods for protein conjugation described herein is well-known to those skilled in the art. These click chemistry handles include the click chemistry reaction partners and groups and click chemistry hands described in [1] H. C. Kolb and M. G. Finn as well as K. B. Sharpless and Angew. Chem. 2001, 113, 2056-2075; Angew. Chem. Int. Ed. 2001, 40, 2004-2021. [2] a] C. J. Hawker and K. L. Wooley in Science 2005, 309-1205; b] D. Fournier, R. Hoogenboom and U. S. Schubert in Chem. Soc. Rev. 2007, 36, 1369-1380; c) W. H. Binder, R. Sachsenhofer, Macromol. Rapid Commun. 2007, 28, 15-54; d) H. C. Kolb, K. B. Sharpless, Drug Discovery Today 2003, 8, 1128-1137; e) V. D. Bock, H. Hiemstra, J. H. van Maarseveen, Eur. J. Org. Chem. 2006, 51-68. [3] a) V. O. Rodionov, V. V. Fokin, M. G. Finn, Angew. Chem. 2005, 117, 2250-2255; Angew. Chem. Int. Ed. 2005, 44, 2210-2215; b) P. L. Golas, N. V. Tsarevsky, B. S. Sumerlin, K. Matyjaszewski, Macromolecules 2006, 39, 6451-6457; c) C. N. Urbani, C. A. Bell, M. R. Whittaker, M. J. Monteiro, Macromolecules 2008, 41, 1057-1060; d) S. Chassaing, A. S. S. Sido, A. Alix, M. Kumarraja, P. Pale, J. Sommer, Chem. Eur. J. 2008, 14, 6713-6721; e) B. C. Boren, S. Narayan, L. K. Rasmussen, L. Zhang, H. Zhao, Z. Lin, G. Jia, V. V. Fokin, J. Am. Chem. Soc. 2008, 130, 8923-8930; f) B. Saba, S. Sharma, D. Sawant, B. Kundu, Synlett 2007, 1591-1594. [4] J. F. Lutz, Angew. Chem. 2008, 120, 2212-2214; Angew. Chem. Int. Ed. 2008, 47, 2182-2184. [5] a) Q. Wang, T. R. Chan, R. Hilgraf, V. V. Fokin, K. B. Sharpless, M. G. Finn, J. Am. Chem. Soc. 2003, 125, 3192-3193; b) J. Gierlich, G. A. Burley, P. M. E. Gramlich, D. M. Hammond, T. Carell, Org. Lett. 2006, 8, 3639-3642. [6] a) J. M. Baskin, J. A. Prescher, S. T. Laughlin, N. J. Agard, P. V. Chang, I. A. Miller, A. Lo, J. A. Codelli, C. R. Bertozzi, Proc. Natl. Acad. Sci. Sci. A. Johnson, J. M. Baskin, C. R. Bertozzi, J. F. Koberstein, N. J. Turro, Chem. 
Commun. 2008, 3064-3066; d) J. A. Codelli, J. M. Baskin, N. J. Agard, C. R. Bertozzi, J. Am. Chem. Soc. 2008, 130, 11486-11493; e) E. M. Sletten, C. R. Bertozzi, Org. Lett. 2008, 10, 3097-3099; f) J. M. Baskin, C. R. Bertozzi, QSAR Comb. Sci. 2007, 26, 1211-1219. [7] a) G. Wittig, A. Krebs, Chem. Ber. Recl. 1961, 94, 3260-3275; b) A. T. Blomquist, L. H. Liu, J. Am. Chem. Soc. 1953, 75, 2153-2154. [8] D. H. Ess, G. O. Jones, K. N. Houk, Org. Lett. 2008, 10, 1633-1636. [9] W. D. Sharpless, P. Wu, T. V. Hansen, J. G. Lindberg, J. Chem. Educ. 2005, 82, 1833-1836. [10] Y. Zou, J. Yin, Bioorg. Med. Chem. Lett. 2008, 18, 5664-5667. [11] X. Ning, J. Guo, M. A. Wolfert, G. J. Boons, Angew. Chem. 2008, 120, 2285-2287; Angew. Chem. Int. Ed. 2008, 47, 2253-2255. [12] S. Sawoo, P. Dutta, A. Chakraborty, R. Mukhopadhyay, O. Bouloussa, A. Sarkar, Chem. Commun. 2008, 5957-5959. [13] a] Z. Li, T. S. Seo and J. Ju Tetrahedron. 2004, 45, 3143-3146; b) S. S. van Berkel, A. J. Dirkes, M. F. Debets, F. L. van Delft, J. J. L. Cornelissen, R. J. M. Nolte, F. P. J. Rutjes, ChemBioChem 2007, 8, 1504-1508; c) S. S. van Berkel, A. J. Dirks, S. A. Meeuwissen, D. L. L. Pingen, O. C. Boerman, P. Laverman, F. L. van Delft, J. J. L. Cornelissen, F. P. J. Rutjes, ChemBio-Chem 2008, 9, 1805-1815. [14] F. Shi, J. P. Waldo, Y. Chen, R. C. Larock, Org. Lett. 2008, 10, 2409-2412. [15] L. Campbell-Verduyn, P. H. Elsinga, L. Mirfeizi, R. A. Dierckx, B. L. Feringa, Org. Biomol. Chem. 2008, 6, 3461-3463. [16] a. The Chemistry of the Thiol Group, Ed. : S. Patai, Wiley, New York 1974; b. A. F. Jacobine in Radiation Curing Polymer Science Technology III (Eds. : J. D. Fouassier, J. F. Rabek), Elsevier, London, 1993, Chap. 7, pp. 219-268. [17] C. E. Hoyle, T. Y. Lee, T. Roper, J. Polym. Sci. Part A 2008 (42, 5338-5338). [18] L. M. Campos, K. L. Killops, R. Sakai, J. M. J. Paulusse, D. Damiron, E. Drockenmuller, B. W. Messmore, C. J. Hawker, Macromolecules 2008, 41, 7063-7070. [19] a) R. L. A. David, J. A. 
Kornfield, Macromolecules 2008, 41, 1151-1161; b) C. Nilsson, N. Simpson, M. Malkoch, M. Johansson, E. Malmstrom, J. Polym. Sci. Part A 2008, 46, 1339-1348; c) A. Dondoni, Angew. Chem. 2008, 120, 9133-9135; Angew. Chem. Int. Ed. 2008, 47, 8995-8997; d) J. F. Lutz, H. Schlaad, Polymer 2008, 49, 817-824. [20] A. Gress, A. Voelkel, H. Schlaad, Macromolecules 2007, 40, 7928-7933. [21] N. ten Brummelhuis, C. Diehl, H. Schlaad, Macromolecules 2008, 41, 9946-9947. [22] K. L. Killops, L. M. Campos, C. J. Hawker, J. Am. Chem. Soc. 2008, 130, 5062-5064. [23] J. W. Chan, B. Yu, C. E. Hoyle, A. B. Lowe, Chem. Commun. 2008, 4959-4961. [24] a) G. Moad, E. Rizzardo, S. H. Thang, Acc. Chem. Res. 2008, 41, 1133-1142; b) C. Barner-Kowollik, M. Buback, B. Charleux, M. L. Coote, M. Drache, T. Fukuda, A. Goto, B. Klumperman, A. B. Lowe, J. B. McLeary, G. Moad, M. J. Monterio, R. D. Sanderson, M. P. Tonge, P. Vana, J. Polym. Sci. Part A 2006, 44. 5809-5831. [25] a) R. J. Pounder, M. J. Stanford, P. Brooks, S. P. Richards, A. P. Dove, Chem. Commun. 2008, 5158-5160; b) M. J. Stanford, A. P. Dove, Macromolecules 2009, 42, 141-147. [26] M. Li, P. De, S. R. Gondi, B. S. Sumerlin, J. Polym. Sci. Part A 2008, 46. 5093-5100. [27] Z. J. Witczak, D. Lorchak, N. Nguyen, Carbohydr. Res. 2007, 342, 1929-1933. [28] a) D. Samaroo, M. Vinodu, X. Chen, C. M. Drain, J. Comb. Chem. 2007, 9, 998-1011; b) X. Chen, D. A. Foster, C. M. Drain. Biochemistry 2004, 43. 10918-10929. c) D. Samaroo and C. E. Soll. L. J. Todaro. C. M. Drain. Org. Lett. 2006, 8, 4985-4988. [29] P. Battioni. O. Brigaud. H. Desvaux. D. Mansuy. T. G. Traylor. Tetrahedron Lett. 1991, 32, 2893-2896. [30] C. Ott, R. Hoogenboom, U. S. Schubert, Chem. Commun. 2008, 3516-3518. [31] a) V. Ladmiral, G. Mantovani, G. J. Clarkson, S. Cauet, J. L. Irwin, D. M. Haddleton, J. Am. Chem. Soc. 2006, 128, 4823-4830; b) S. G. Spain, M. I. Gibson, N. R. Cameron, J. Polym. Sci. Sci. Part A 2007, 45. 2059-2072. [32] C. R. Becer, K. Babiuch, K. Pilz, S. 
Hornig, T. Heinze, M. Gottschaldt, U. S. Schubert, Macromolecules 2009, 42, 2387-2394. The reaction was first described by Kurt Alder and Otto Paul Hermann Diels in 1928. Their work on the eponymous reaction earned them the Nobel Prize in Chemistry in Chemistry in 1950. [34] a) H. L. Holmes, R. M. Husband, C. C. Lee, P. Kawulka, J. Am. Chem. Soc. 1948, 70, 141-142; b) M. Lautens, W. Klute, W. Tam, Chem. Rev. 1996, 96, 49-92; c) K. C. Nicolaou, S. A. Snyder, T. Montagnon, G. Vassilikogiannakis, Angew. Chem. 2002, 114, 1742-1773; Angew. Chem. Int. Ed. 2002, 41, 1668-1698; d) E. J. Corey, Angew. Chem. 2002, 114, 1724-1741; Angew. Chem. Int. Ed. 2002, 41, 1650-1667. [35] a. H. Durmaz. O. Altintas. T. Erdogan. G. Hizal. U. Tunca. Macromolecules 2007. 40, 191-198. b. H. Durmaz. A. Dag. A. Hizal. G. Hizal. U. Tunca. J. Polym. Sci. Part A 2008, 46; c), A. Dag and H. Durmaz. c) E. Demir. G. Hizal. U. Tunca. J. Polym. Sci. Part A 2008 (46, 6969-6977); d) B. Gacal and H. Akat; D. K. Balta; N. Arsu; Y. Yagci; Macromolecules 2008 (41) 2401-2405; e). A. Dag. H. Durmaz. U. Tunca. G. Hizal., J. Polym. Sci. Sci. [36] M. L. Blackman, M. Royzen, J. M. Fox, J. Am. Chem. Soc. 2008, 130, 13518-13519. [37] It is important to note that trans-cyclooctene, which is seven orders of magnitude more reactive toward tetrazines than ciscyclooctene, is the most reactive. [38] N. K. Devaraj, R. Weissleder, S. A. Hilderbrand, Bioconjugate Chem. 2008, 19, 2297-2299. [39] W. Song, Y. Wang, J. Qu, Q. Lin, J. Am. Chem. Soc. 2008, 130, 9654-9655. [40] W. Song, Y. Wang, J. Qu, M. M. Madden, Q. Lin, Angew. Chem. 2008, 120, 2874-2877; Angew. Chem. Int. Ed. 2008, 47, 2832-2835. [41] A. Dag, H. Durmaz, G. Hizal, U. Tunca, J. Polym. Sci. Part A 2008, 46. 302-313. [42] a) A. J. Inglis, S. Sinnwell, T. P. Davis, C. Barner-Kowollik, M. H. Stenzel, Macromolecules 2008, 41, 4120-4126; b) S. Sinnwell, A. J. Inglis, T. P. Davis, M. H. Stenzel, C. Barner-Kowollik, Chem. Commun. 2008, 2052-2054. [43] A. J. Inglis, S. 
Sinnwell, M. H. Stenzel, C. Barner-Kowollik, Angew. Chem. 2009, 121, 2447-2450; Angew. Chem. Int. Ed. 2009, 48, 2411-2414. All references cited above are incorporated herein by reference for their disclosure of click chemistry handles that can be installed on proteins according to the inventive concepts and methods.

“For example, some embodiments provide a first protein comprising a C-terminal strained alkyne, such as a C-terminal cyclooctyne, as a click chemistry handle, and a second protein comprising a C-terminal azide as a click chemistry handle. The two click chemistry handles are reactive with each other and can undergo a strain-promoted cycloaddition, which results in the first and second proteins being joined by a covalent bond. In this example, the two proteins are conjugated via their C-termini (C-to-C).

“In some embodiments, a first molecule, for example a first protein, comprising a nucleophilic click chemistry handle (Nu) selected from -SH, -OH, -NHRb5, -NH-NHRb5, or -N=NH, is conjugated to a second molecule, for example a second protein, comprising the electrophilic partner click chemistry handle (E)”

“to form a chimeric protein with a conjugated group of the formula:”

“Zb9 is -S-, -O-, -N(Rb5)-, -NH-N(Rb5)-, or -N=N-. In some embodiments, the nucleophilic click chemistry handle Nu is -SH and Zb9 is -S-. In some embodiments, Nu is -OH and Zb9 is -O-. In some embodiments, Nu is -NHRb5 and Zb9 is -N(Rb5)-. In some embodiments, Nu is -NH-NHRb5 and Zb9 is -NH-N(Rb5)-. In some embodiments, Nu is -N=NH and Zb9 is -N=N-. In certain embodiments, Rb5 is hydrogen.

“In some embodiments, Nu is -SH, -OH, -NHRb5, -NH-NHRb5, or -N=NH,

“and the two molecules, for example two proteins, are conjugated to form a chimeric molecule, for example a chimeric protein, in which Nu and E are joined to form a conjugated group of the formula:

Summary for “Using sortases for click chemistry handles to protein ligation”

“Some aspects of this invention relate to sortase-mediated modification of proteins, in particular to the installation of reactive chemical groups, such as click chemistry handles, on protein sequences. Methods and reagents for installing reactive chemical groups on proteins are provided, as are modified proteins, for example proteins comprising a C-terminal or an N-terminal click chemistry handle. Methods for conjugating two proteins modified according to this invention are also provided. These methods can be used to dimerize monomeric proteins and to create chimeric proteins that combine the properties of heterologous single proteins (e.g., chimeric, bispecific antibodies).

“Some aspects provide methods, compositions, and reagents for the C-terminal or N-terminal addition of click chemistry handles to proteins via a sortase transacylation reaction. Some aspects of the invention provide methods for installing a click chemistry handle at or proximal to the C-terminus of a protein having a sortase recognition motif (e.g., LPXT (SEQ ID NO: 144)) near its C-terminus. Some aspects provide methods for installing a click chemistry handle at the N-terminus of a protein that comprises one or more N-terminal glycine residues.

“For example, some embodiments provide a method of conjugating a target protein to a C-terminal click chemistry handle. In some embodiments, the method comprises providing the target protein with a C-terminal sortase recognition motif (e.g., LPXT (SEQ ID NO: 144)), for example as a C-terminal fusion. In some embodiments, the method further comprises contacting the target protein with an agent, such as a peptide, a protein, or a compound, that comprises one to ten N-terminal glycine residues or an alkylamine group together with the click chemistry handle. In some embodiments, the contacting is carried out in the presence of a sortase enzyme under conditions that permit the sortase to transamidate the target protein with the agent bearing the click chemistry handle, thereby conjugating the target protein to the click chemistry handle.
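
The outcome of the C-terminal labeling method described above can be sketched at the sequence level. The sketch assumes a sortase A-type LPXTG site and an oligoglycine nucleophile: the site is cleaved between T and G, the residues C-terminal of the threonine are released, and the glycine-bearing handle is attached, giving a product of the form PRT-LPXT-[Xaa]y-CCH as in Formula (I) with Xaa = G. The function name, the bracketed handle notation, and the example sequence are hypothetical, and the model ignores reaction yield, reversibility, and hydrolysis.

```python
import re


def sortag_c_terminus(protein, n_glycines=3, handle="azide"):
    """Model sortase-mediated C-terminal installation of a click handle.

    Returns (product, leaving_group), where product is the sequence through
    the threonine of the last LPXTG site, followed by the nucleophile's
    glycines and a bracketed handle label.
    """
    if not 1 <= n_glycines <= 10:
        raise ValueError("the text describes 1-10 N-terminal glycines")
    sites = list(re.finditer(r"LP.TG", protein))
    if not sites:
        raise ValueError("no LPXTG recognition motif found")
    site = sites[-1]                  # use the site closest to the C-terminus
    cut = site.start() + 4            # cleavage between T and G
    acyl_donor, leaving_group = protein[:cut], protein[cut:]
    product = acyl_donor + "G" * n_glycines + "-[" + handle + "]"
    return product, leaving_group


# Hypothetical substrate: protein-LPETG followed by a His tag.
product, leaving = sortag_c_terminus("MKTAYLPETGGHHHHHH", 3, "azide")
```

In this example the His tag leaves with the released fragment, which is one reason such a tag is often placed C-terminal of the motif in practice.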

“Some embodiments provide a method of conjugating a target protein to an N-terminal click chemistry handle. In some embodiments, the method comprises providing the target protein with one to ten N-terminal glycine residues or an N-terminal alkylamine group, for example as an N-terminal fusion. In some embodiments, the method further comprises contacting the target protein with a peptide that comprises a sortase recognition motif (e.g., LPXT (SEQ ID NO: 144)) and the click chemistry handle. In some embodiments, the contacting is carried out in the presence of a sortase enzyme under conditions that permit the sortase to transamidate the target protein and the peptide, thereby conjugating the target protein to the click chemistry handle.

“Any chemical moiety can be installed on a protein using the methods described herein. Click chemistry handles are of particular use according to certain aspects of the invention. Click chemistry handles are chemical moieties that provide a reactive group that can take part in a click chemistry reaction. Click chemistry reactions and suitable chemical groups are well known to those skilled in the art and include terminal alkynes, azides, strained alkynes, dienes, alkoxyamines, and carbonyls. In some embodiments, an azide and an alkyne are used in a click chemistry reaction.

“Some aspects provide modified proteins, for example proteins comprising a C-terminal or an N-terminal click chemistry handle. Such proteins can be conjugated to other molecules, such as proteins, nucleic acids, polymers, lipids, or small molecules. In some embodiments, the modified protein comprises an antigen-binding domain, for example an antigen-binding domain of an antibody, such as a camelid antibody, a single-domain antibody, a VHH domain, a nanobody, or an ScFv, or an antigen-binding fragment thereof.

“Some aspects provide methods that use click chemistry to conjugate, or ligate, two protein molecules. In some embodiments, a first click chemistry handle is installed on the first protein and a second click chemistry handle is installed on the second protein, where the first click chemistry handle can form a covalent bond with the second. Some embodiments provide a method of post-translationally conjugating two proteins to form a chimeric protein. The method may include contacting a first protein conjugated to a first click chemistry handle with a second protein conjugated to a second click chemistry handle under conditions that allow the first click chemistry handle to react with the second click chemistry handle. This generates a chimeric protein comprising the two proteins linked by a covalent bond.

“The methods described herein enable N-terminus-to-N-terminus and C-terminus-to-C-terminus conjugation of proteins, which cannot be achieved by recombinant methods (e.g., expression of protein fusions). For example, in some embodiments, the first click chemistry handle is conjugated to the N-terminus of the first protein, and the second click chemistry handle is conjugated to the N-terminus of the second protein, and the chimeric protein is an N-terminus-to-N-terminus conjugation of the two proteins. In other embodiments, the first click chemistry handle is conjugated to the C-terminus of the first protein and the second click chemistry handle is conjugated to the C-terminus of the second protein, and the chimeric protein is a C-terminus-to-C-terminus conjugation of the two proteins. In some embodiments, click chemistry handles are used to join the C-terminus of a first polypeptide to the N-terminus of a second polypeptide, as an alternative to producing a fusion protein recombinantly. This is especially useful if, e.g., the fusion protein is large, toxic, difficult to purify, or encoded by nucleic acid sequences that are difficult to clone, or if cloning is to be avoided.
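
The relationship between handle placement and conjugate topology described above can be summarized in a short sketch. The function names are hypothetical. The rule encoded in `achievable_by_genetic_fusion` reflects the text's statement that N-to-N and C-to-C linkages are not accessible by expressing a protein fusion, since a genetically encoded fusion joins the C-terminus of one polypeptide to the N-terminus of the next.

```python
VALID_TERMINI = ("N", "C")


def conjugate_topology(handle_terminus_first, handle_terminus_second):
    """Describe the linkage formed when two handle-bearing proteins react."""
    for terminus in (handle_terminus_first, handle_terminus_second):
        if terminus not in VALID_TERMINI:
            raise ValueError("terminus must be 'N' or 'C'")
    return (f"{handle_terminus_first}-terminus-to-"
            f"{handle_terminus_second}-terminus")


def achievable_by_genetic_fusion(handle_terminus_first, handle_terminus_second):
    """A recombinant fusion can only join unlike termini (C to N)."""
    return handle_terminus_first != handle_terminus_second


for pair in (("N", "N"), ("C", "C"), ("C", "N")):
    print(conjugate_topology(*pair), achievable_by_genetic_fusion(*pair))
```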

“Some embodiments provide chimeric proteins, such as chimeric proteins created by post-translational conjugation of two proteins according to aspects of this invention. Some embodiments provide chimeric, bispecific antibodies that contain two antigen-binding proteins. A bispecific, chimeric antibody may include a first antibody, or antigen-binding fragment thereof, that contains a sortase recognition sequence and a second antibody, or antigen-binding fragment thereof, that contains a sortase recognition sequence. The first and second antibodies or fragments are conjugated together via click chemistry.

“It is important to note that the invention is not limited to the conjugation of antigen-binding proteins. Any protein can be conjugated to any molecule that carries a suitable click chemistry handle. Such handles can be installed using the methods described herein or other methods well known to those skilled in the art. Some embodiments provide chimeric proteins that contain a target protein having a sortase recognition motif (e.g. LPXT (SEQ ID NO: 144)) and a second molecule conjugated to the protein via click chemistry. In some embodiments, the chimeric protein is generated by installing a click chemistry handle on the target protein and then contacting the target protein carrying the click chemistry handle with the second molecule. The second molecule contains a second click chemistry handle that can react with the click chemistry handle of the target protein to form a covalent bond.

“Some embodiments provide modified proteins, for example, proteins that contain a sortase recognition motif (e.g. LPXT (SEQ ID NO: 144)) and a click chemistry handle conjugated to the sortase recognition motif, for instance, directly to one of the amino acids of the sortase recognition motif or via a linker. In some embodiments, the modified protein comprises an antigen-binding domain, such as an antibody or an antigen-binding antibody fragment. Examples of modified proteins provided herein include a camelid antibody or a fragment thereof, a VHH domain, a single-domain antibody, an affibody, and an anticalin. In some embodiments, the click chemistry handle is located at the C-terminus; in other embodiments, it is at the N-terminus. The click chemistry handle can be selected from the following: terminal alkyne, azide, strained alkyne, dienophile, carbonyl, alkoxyamine, hydrazide, thiol, and alkene.

“Some embodiments provide kits that include one or more reagents useful in carrying out the methods described herein. The kit may include a first peptide that comprises 1-10 glycine residues or a terminal alkylamine conjugated to a first click chemistry handle, and a second peptide that comprises 1-10 glycine residues or a terminal alkylamine conjugated to a second click chemistry handle, where the click chemistry handles of the first and second peptides can react with each other. Alternatively, the kit may include a first peptide that comprises a sortase recognition motif conjugated to a first click chemistry handle and a second peptide that comprises a sortase recognition motif conjugated to a second click chemistry handle, where the click chemistry handles of the first and second peptides can react with each other. The kit may also include a sortase enzyme. Some embodiments further include instructions for use, a catalyst (e.g. a metal catalyst), and/or a reaction buffer.

The above summary is meant to provide an overview of some aspects of the invention and should not be taken to limit the invention. Further aspects, advantages, and embodiments of the invention are described in this disclosure, and additional embodiments will be apparent to those skilled in the art based on the present disclosure. All references cited herein are hereby incorporated by reference.

Standard genetic approaches permit the production of protein composites by fusion of polypeptides in a head-to-tail manner. However, some applications would benefit from constructions that are genetically impossible, such as the site-specific linkage of proteins via their N-termini or their C-termini, for example when biological activity requires the remaining terminus to be free.

“The production and purification of fusion proteins remains a challenging biotechnological problem. To yield an active product, both domains must adopt their native fold, without modification of residues or regions required for activity. Genetic fusion, in which the open reading frames of two proteins, or fragments thereof, are joined, is the most common method for creating fusion proteins. However, expression of fusion proteins often yields defective folding products and partially folded proteins.

This problem can be circumvented by post-translational conjugation of natively folded, purified proteins, for example, by means of a ligation tag. Such methods label the N- or C-terminus of suitably modified proteins to create the adducts of interest, much as if one were making the corresponding genetic fusion. Sortase-catalyzed transacylation reactions allow such site-specific labeling, as well as the preparation of head-to-tail protein-protein fusions under native conditions (Popp M W, Ploegh H L (2011) Making and breaking peptide bonds: protein engineering using sortase. Angew Chem Int Ed 50:5024-5032; Guimaraes C P et al. (2011) Identification of host cell factors required for intoxication through use of modified cholera toxin. J Cell Biol 195:751-764; Popp M W, Antos J M, Grotenbreg G M, Spooner E, Ploegh H L (2007) Sortagging: a versatile method for protein labeling. Nat Chem Biol 3:707-708; the entire contents of each of which are incorporated herein by reference).

Standard sortase-ligation methods do not permit the production of protein-protein fusions that are genetically impossible (N-terminus to N-terminus; C-terminus to C-terminus), although such unnatural linkages would be very useful, for example for the creation of bispecific antibodies and fragments thereof. Some aspects of this invention recognize that chemical ligation is required to achieve such fusions. Early chemical conjugation strategies relied on non-specific crosslinking via amines or sulfhydryls (Kim J S, Raines R T (1995) Dibromobimane as a fluorescent crosslinking reagent. Anal Biochem 225:174-176, the entire contents of which are incorporated herein by reference). The usefulness of this approach is limited because it offers no control over the site or stoichiometry of modification. Bioorthogonal chemistries, site-specific mutation, native chemical ligation, and amber suppressor pyrrolysine tRNA technology have since enabled the creation of non-natural protein fusions, which can be used to produce bivalent and multivalent antibodies (Schellinger J G et al. (2012) A general chemical synthesis platform for crosslinking multivalent single chain variable fragments. Org Biomol Chem 10:1521-1526; Natarajan A et al. (2007) Construction of di-scFv through a trivalent alkyne-azide 1,3-dipolar cycloaddition. Chem Commun 695-697; Xiao J, Hamilton B S, Tolbert T J (2010) Synthesis of N-terminally linked protein and peptide dimers by native chemical ligation. Bioconjug Chem 21:1943-1947; the entire contents of each of which are incorporated herein by reference). Structural analogs of ubiquitin dimers were synthesized by a combination of intein-based ligation and site-specific mutation (Chem Commun 48:296; Weikart N D, Mootz H D (2010) Generation of site-specific and enzymatically stable conjugates of recombinant proteins with ubiquitin-like modifiers by the Cu(I)-catalyzed azide-alkyne cycloaddition. ChemBioChem 11:774-777; the entire contents of each of which are incorporated herein by reference). Site-specific incorporation of propargyloxyphenylalanine facilitated the synthesis of GFP dimers (Schellinger J G et al. (2012) A general chemical synthesis platform for crosslinking multivalent single chain variable fragments. Org Biomol Chem 10:1521-1526; Bundy B C, Swartz J R (2010) Site-specific incorporation of p-propargyloxyphenylalanine in a cell-free environment for direct protein-protein click conjugation. Bioconjug Chem 21:255-263; the entire contents of each of which are incorporated herein by reference).

“Nevertheless, the synthesis of bispecifics would benefit from a method that is orthogonal to the published methods, gives easy access to modified native proteins, and allows efficient non-natural conjugation. The availability of orthogonal methods also allows the creation of complex protein structures (e.g. heterotrimers or higher-order complexes). Disclosed herein are reagents and methods for conjugating proteins at their N- or C-terminus to other entities, including, but not limited to, other proteins. One such method installs click chemistry handles on proteins using a sortase-catalyzed transpeptidation reaction. The modified proteins can then be conjugated to a molecule that carries a reactive click chemistry handle.

“Some aspects of this invention relate to the recognition that the sortase transacylation reaction permits the facile installation of all kinds of substituents at the C-terminus of a suitably modified protein. The sole requirement for a successful transacylation reaction is a suitably exposed sortase recognition motif, e.g. an LPXT (SEQ ID NO: 144) or LPXTG (SEQ ID NO: 2) motif, in the target protein. The design of nucleophiles for use in a sortase-catalyzed reaction is likewise straightforward: a short run (e.g. 1-10) of glycine residues, or even an alkylamine, suffices to allow the reaction to proceed. The key advantages of a sortase transacylation strategy for modifying a target protein are the ease of synthesis and the execution of the reaction on native proteins under physiological conditions.
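The two requirements named above (an LPXTG motif in the target protein and a short N-terminal glycine run on the nucleophile) can be illustrated with a small string-level sketch in Python. This is only an illustrative model under stated assumptions, not part of the disclosed methods: sequences are plain one-letter-code strings, the function names are hypothetical, the 1-10 glycine range is treated as a strict rule although the text gives it only as an example, and the modeled product simply keeps everything up to and including the threonine of the last LPXTG match and appends the nucleophile.

```python
import re

# Sortase A consensus motif from the text: L, P, any residue (X), T, G.
# Only uppercase one-letter codes are modeled (an assumption of this sketch).
LPXTG = re.compile(r"LP[A-Z]TG")


def find_lpxtg(protein_seq: str) -> list:
    """Return the 0-based start index of every LPXTG match in the sequence."""
    return [m.start() for m in LPXTG.finditer(protein_seq.upper())]


def n_terminal_glycines(peptide_seq: str) -> int:
    """Count the consecutive glycine residues at the N-terminus of a peptide."""
    return len(re.match(r"G*", peptide_seq.upper()).group(0))


def is_suitable_nucleophile(peptide_seq: str) -> bool:
    """True if the peptide starts with a run of 1-10 glycines.

    The strict 1-10 cutoff is an assumption; the text cites it as an example.
    """
    return 1 <= n_terminal_glycines(peptide_seq) <= 10


def model_transacylation(protein_seq: str, nucleophile_seq: str) -> str:
    """String-level model of the exchange: residues C-terminal of the
    threonine in the last LPXTG match are replaced by the nucleophile."""
    sites = find_lpxtg(protein_seq)
    if not sites:
        raise ValueError("no LPXTG motif found in the target protein")
    if not is_suitable_nucleophile(nucleophile_seq):
        raise ValueError("nucleophile lacks an N-terminal run of 1-10 glycines")
    cut = sites[-1] + 4  # keep L, P, X and T
    return protein_seq.upper()[:cut] + nucleophile_seq.upper()


if __name__ == "__main__":
    # Hypothetical target with a C-terminal LPETG motif followed by a His tag.
    print(model_transacylation("MKTAYLPETGHHHHHH", "GGGK"))
```

The sketch ignores motif accessibility, which the text treats as essential; a sequence match alone does not show that the enzyme can reach the site.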

Some aspects of the invention relate to the recognition that the nucleophiles used in sortase reactions can be modified to include any number of modifications: biotin, detectable labels (e.g. fluorophores), fatty acids, nucleic acids, lipids, carbohydrates, radioisotopes, or even proteins with a suitably exposed N-terminal stretch of glycine residues. Some aspects of the invention further provide that nucleophiles used in a sortase reaction can contain reactive chemical moieties, for example, moieties, or "handles", that are suitable for a click chemistry reaction, e.g. a copper-free click chemistry reaction. Such nucleophiles, e.g. peptides comprising 1-10 glycine residues (e.g. GGG), or any compound (e.g. a peptide) comprising an alkylamine group, together with a click chemistry handle, can be used to install a C-terminal click chemistry handle on a target protein that has a C-terminal sortase recognition motif. The sortase recognition motif does not have to be at the very C-terminus, but it must be sufficiently accessible to the enzyme for the sortase reaction to take place.

Click chemistry handles can also be installed N-terminally on proteins with a short glycine run, or on proteins or other compounds containing an alkylamine group (e.g. at their N-terminus for proteins), by carrying out a sortase reaction with a peptide that contains a sortase recognition motif and the desired click chemistry handle. Accordingly, any protein that contains a sortase recognition motif, 1-10 glycine residues, or a terminal alkylamine group can be derivatized with a click chemistry handle according to aspects of this invention. Installing a click chemistry handle on a target protein confers click chemistry reactivity on that protein. For example, a protein comprising a click chemistry handle, as described herein, can react with a second molecule comprising a second click chemistry handle to form a covalent bond, thereby conjugating the two molecules.

“In some embodiments, proteins carrying reactive click chemistry handles are conjugated by carrying out the respective click chemistry reaction, so that the proteins become linked to one another via a covalent bond. Because the inventive strategies allow the installation of a click chemistry handle on either the C-terminus or the N-terminus of a protein, two proteins so modified can be conjugated via a covalent bond from the C-terminus of the first protein to the N-terminus of the second, much like a conventional protein fusion. However, installing C-terminal reactive click chemistry handles on both target proteins allows the generation of proteins that are covalently conjugated at their C-termini (C-to-C-termini, C-C), while installing N-terminal reactive click chemistry handles on both target proteins allows the generation of proteins that are conjugated at their N-termini (N-to-N-termini, N-N). Conventional protein engineering technologies, such as recombinant protein fusion technology, can achieve neither covalent C-C nor covalent N-N conjugation.
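The relationship between handle placement and conjugate topology described above can be summarized in a short Python sketch. The class and function names are hypothetical and chosen only for illustration; the sketch encodes nothing beyond the statement in the text that C-to-N linkage resembles a conventional fusion, whereas C-C and N-N linkages cannot be obtained by recombinant fusion.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class HandledProtein:
    """A protein carrying one click chemistry handle (hypothetical model)."""
    name: str
    terminus: str  # "N" or "C": the terminus that carries the handle


def _check(protein: HandledProtein) -> None:
    if protein.terminus not in ("N", "C"):
        raise ValueError(f"{protein.name}: terminus must be 'N' or 'C'")


def conjugate_topology(first: HandledProtein, second: HandledProtein) -> str:
    """Name the linkage formed when the two handles react, e.g. 'C-to-C'."""
    _check(first)
    _check(second)
    return f"{first.terminus}-to-{second.terminus}"


def matches_genetic_fusion(first: HandledProtein, second: HandledProtein) -> bool:
    """True only for head-to-tail (C-to-N or N-to-C) linkage, the one
    topology that a recombinant fusion can also produce."""
    _check(first)
    _check(second)
    return first.terminus != second.terminus


if __name__ == "__main__":
    a = HandledProtein("VHH-1", "C")
    b = HandledProtein("VHH-2", "C")
    print(conjugate_topology(a, b), matches_genetic_fusion(a, b))
```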

“Sortase Mediated Installation of Click Chemistry Handles”

“Sortases and sortase-mediated transacylation reactions are well known to those skilled in the art. In general, the transpeptidation reaction catalyzed by sortase results in the ligation of species containing a transamidase recognition motif with species bearing one or more N-terminal glycine residues. In certain embodiments, the sortase recognition motif is a sortase recognition motif described herein, for example an LPXT (SEQ ID NO: 144) or an LPXTG (SEQ ID NO: 2) motif. Substituting the C-terminal residue of the recognition sequence with a moiety that exhibits poor nucleophilicity once released from the sortase results in a more efficient ligation.

“The sortase transacylation reaction allows the efficient linkage of an acyl donor to a nucleophilic acyl acceptor. This principle can be used to link many kinds of acyl donors to a variety of acyl acceptors. The sortase reaction has previously been used to ligate proteins and/or peptides to one another, to link a reporter molecule to a protein or peptide, to join a nucleic acid to a protein or peptide, to conjugate a protein or peptide to a solid support or polymer, and to link a protein or peptide to a label. Such products and processes save cost and time in the synthesis of ligation products and make it easy to link an acyl donor to an acyl acceptor.

Sortase-mediated transacylation reactions are catalyzed by the transamidase activity of sortase. A transamidase can form a peptide linkage (i.e. an amide linkage) between an acyl donor compound and a nucleophilic acyl acceptor containing an NH2-CH2- moiety. In some embodiments, the sortase is sortase A (SrtA). It should be noted, however, that the invention is not limited to SrtA. Sortases are found in Gram-positive bacteria, which contain peptidoglycan, polysaccharides, and/or teichoic acids as part of the cell wall structure. Gram-positive bacteria include the following genera: Actinomyces, Bacillus, Bifidobacterium, Cellulomonas, Clostridium, Corynebacterium, Micrococcus, Mycobacterium, Nocardia, Staphylococcus, Streptococcus, and Streptomyces.

“Sortase Mediated Installation of C-Terminal Click Chemistry Handles”

“In some embodiments, a sortase-mediated transacylation reaction for installing a C-terminal click chemistry handle on a protein comprises a step of contacting a protein comprising a transamidase recognition sequence of the structure:

“wherein”

“Those skilled in the art will understand that the click chemistry handle can be incorporated into B1 in any suitable manner and at any suitable position. For example, B1 could contain an amino acid (e.g. lysine), and the click chemistry handle can be attached to the central carbon, the side chain, or the carboxyl group of the amino acid. Other ways of incorporating the click chemistry handle into B1 will be apparent to those skilled in the art.

It will be further understood that, depending on the nature of B1, the click chemistry handle may be installed at or near the C-terminus. For example, if B1 consists of a single amino acid carrying the click chemistry handle, the handle is installed at the very C-terminus of the target protein; if B1 contains a first amino acid comprising the click chemistry handle followed by several additional amino acids, the handle is installed close to, but not directly at, the C-terminus. It will be apparent to those skilled in the art that a similar situation exists for N-terminal installation.

“One of ordinary skill in the art will appreciate that, in certain embodiments, the C-terminal amino acid of the transamidase recognition sequence is omitted; that is, an acyl group replaces the C-terminal amino acid of the transamidase recognition sequence.

“In some embodiments, the sortase, or transamidase, recognition sequence is LPXT (SEQ ID NO: 144), in which X is a standard or non-standard amino acid. In some embodiments, X is selected from D, E, A, N, Q, K, and R. In some embodiments, X is chosen to match a naturally occurring transamidase recognition sequence. In some embodiments, the transamidase recognition sequence is selected from LPKT, LPIT (SEQ ID NO: 49), and SPKT. In some embodiments (e.g. those in which sortase B is used), the transamidase recognition sequence comprises X1PX2X3 (SEQ ID NO: 152), in which X1 is leucine or isoleucine, X2 is any amino acid, and X3 is threonine or serine. P is proline and G is glycine. As noted above, in certain embodiments X1 is leucine and X3 is threonine. In some embodiments, X2 is aspartate, glutamate, alanine, glutamine, or methionine. In certain embodiments, such as those that use sortase B, the recognition sequence often comprises the amino acid sequence NPX1TX2 (SEQ ID NO: 153), in which X1 is glutamine or lysine, X2 is asparagine or glycine, N is asparagine, P is proline, and T is threonine. This invention recognizes that the selection of X may be based at least in part on the desired properties of the compound containing the recognition motif. In some embodiments, X is selected to alter a property of the compound containing the recognition motif, for example, to increase or decrease its solubility in a solvent. In some embodiments, X is selected to be compatible with the reaction conditions used to synthesize the compound containing the recognition motif; for example, X may be selected so that it does not react with any of the reactants in the synthesis.

“In some embodiments, X is -O-. In some embodiments, X is -NR-. In some embodiments, X is -NH-. In some embodiments, X is -S-.”

“In some embodiments, R1 is substituted aliphatic. In some embodiments, R1 is unsubstituted aliphatic. In some embodiments, R1 is substituted C1-12 aliphatic. In some embodiments, R1 is unsubstituted C1-12 aliphatic. In some embodiments, R1 is substituted C1-6 aliphatic. In some embodiments, R1 is unsubstituted C1-6 aliphatic. In some embodiments, R1 is C1-3 aliphatic. In some embodiments, R1 is butyl, for example n-butyl or isobutyl. In some embodiments, R1 is propyl, for example n-propyl or isopropyl. In some embodiments, R1 is ethyl. In some embodiments, R1 is methyl.

“In some embodiments, R1 is substituted aryl. In some embodiments, R1 is unsubstituted aryl. In some embodiments, R1 is substituted phenyl. In some embodiments, R1 is unsubstituted phenyl.

“In some embodiments, A1 comprises a protein. In some embodiments, A1 comprises a peptide. In some embodiments, A1 comprises an antibody, an antibody fragment, an antibody epitope, an affibody, an anticalin, or a DARPin. In some embodiments, A1 comprises a recombinant protein or a protein that contains one or more D-amino acids. In some embodiments, A1 comprises an amino acid sequence of at least three amino acids. In some embodiments, A1 comprises green fluorescent protein. In some embodiments, A1 comprises ubiquitin.

“In some embodiments, B1 comprises a click chemistry handle. In some embodiments, B1 comprises a click chemistry handle described herein, for example, in Table 1, Table 2, or FIG. 2B. In certain embodiments, B1 comprises a terminal alkyne, an azide, a strained alkyne, a dienophile, or an alkoxyamine.”

“In some embodiments, n is an integer between 0 and 50, inclusive. In some embodiments, n is an integer between 0 and 20, inclusive. In some embodiments, n is 0. In some embodiments, n is 1. In some embodiments, n is 2. In some embodiments, n is 3. In some embodiments, n is 4. In some embodiments, n is 5. In some embodiments, n is 6.

“Sortase Mediated Installation of N-Terminal Click Chemistry Handles”

“In some embodiments, a sortase-mediated transacylation reaction for installing an N-terminal click chemistry handle on a protein comprises a step of contacting a protein of the structure:

“wherein”

“Those skilled in the art will understand that the click chemistry handle can be incorporated into A1 in any suitable manner and at any suitable position. For example, A1 could contain an amino acid (e.g. lysine), and the click chemistry handle can be attached to the central carbon, the side chain, or the amino group of the amino acid. Other ways of incorporating the click chemistry handle into A1 will be apparent to those skilled in the art.

“One of ordinary skill in the art will appreciate that, in certain embodiments, the C-terminal amino acid of the transamidase recognition sequence is omitted; that is, an acyl group replaces the C-terminal amino acid of the transamidase recognition sequence.

“In some embodiments, the sortase, or transamidase, recognition sequence is LPXT (SEQ ID NO: 144), in which X is a standard or non-standard amino acid. In some embodiments, X is selected from D, E, A, N, Q, K, and R. In some embodiments, X is chosen to match a naturally occurring transamidase recognition sequence. In some embodiments, the transamidase recognition sequence is selected from LPKT (SEQ ID NO: 48), LPIT (SEQ ID NO: 49), LPDT (SEQ ID NO: 50), LPDT (SEQ ID NO: 52), LAAT (SEQ ID NO: 53), LAAT (SEQ ID NO: 54), LAET (SEQ ID NO: 55), LAET (SEQ ID NO: 56), LPLT (SEQ ID NO: 65), LPQT (SEQ ID NO: 69), NAKT (SEQ ID NO: 69), and NPQS (SEQ ID NO: 70). In some embodiments (e.g. those in which sortase B is used), the transamidase recognition sequence comprises X1PX2X3 (SEQ ID NO: 152), in which X1 is leucine or isoleucine, X2 is any amino acid, and X3 is threonine or serine. P is proline and G is glycine. As noted above, in certain embodiments X1 is leucine and X3 is threonine. In some embodiments, X2 is aspartate, glutamate, alanine, or glutamine. In certain embodiments, such as those that use sortase B, the recognition sequence often comprises the amino acid sequence NPX1TX2 (SEQ ID NO: 153), in which X1 is glutamine or lysine, X2 is asparagine or glycine, N is asparagine, P is proline, and T is threonine. This invention recognizes that the selection of X may be based at least in part on the desired properties of the compound containing the recognition motif. In some embodiments, X is selected to alter a property of the compound containing the recognition motif, for example, to increase or decrease its solubility in a solvent. In some embodiments, X is selected to be compatible with the reaction conditions used to synthesize the compound containing the recognition motif; for example, X may be selected so that it does not react with any of the reactants in the synthesis.

“In some embodiments, X is -O-. In some embodiments, X is -NR-. In some embodiments, X is -NH-. In some embodiments, X is -S-.”

“In some embodiments, R1 is substituted aliphatic. In some embodiments, R1 is unsubstituted aliphatic. In some embodiments, R1 is substituted C1-12 aliphatic. In some embodiments, R1 is unsubstituted C1-12 aliphatic. In some embodiments, R1 is substituted C1-6 aliphatic. In some embodiments, R1 is unsubstituted C1-6 aliphatic. In some embodiments, R1 is C1-3 aliphatic. In some embodiments, R1 is butyl, for example n-butyl or isobutyl. In some embodiments, R1 is propyl, for example n-propyl or isopropyl. In some embodiments, R1 is ethyl. In some embodiments, R1 is methyl.

“In some embodiments, R1 is substituted aryl. In some embodiments, R1 is unsubstituted aryl. In some embodiments, R1 is substituted phenyl. In some embodiments, R1 is unsubstituted phenyl.

“In some embodiments, B1 comprises a protein. In some embodiments, B1 comprises a peptide. In some embodiments, B1 comprises an antibody, an antibody chain, an antibody fragment, an antibody epitope, an antigen-binding antibody domain, or an enzyme. In some embodiments, B1 comprises a recombinant protein or a protein that contains one or more D-amino acids. In some embodiments, B1 comprises an amino acid sequence of at least three amino acids. In some embodiments, B1 comprises green fluorescent protein. In some embodiments, B1 comprises ubiquitin.

“In some embodiments, A1 comprises a click chemistry handle. In some embodiments, A1 comprises a click chemistry handle described herein, for example, in Table 1, Table 2, or FIG. 2B. In certain embodiments, A1 comprises a terminal alkyne, an azide, a strained alkyne, a dienophile, or an alkoxyamine.”

“In some embodiments, n is an integer between 0 and 50, inclusive. In some embodiments, n is an integer between 0 and 20, inclusive. In some embodiments, n is 0. In some embodiments, n is 1. In some embodiments, n is 2. In some embodiments, n is 3. In some embodiments, n is 4. In some embodiments, n is 5. In some embodiments, n is 6.

“Suitable Enzymes, Recognition Motifs”

“In certain embodiments, the transamidase is a sortase. Enzymes identified as "sortases" are enzymes from Gram-positive bacteria that cleave proteins and translocate them to intact cell walls. Sortase A (SrtA) and sortase B (SrtB) are among the sortases that have been isolated from Staphylococcus aureus. In certain embodiments, the transamidase used according to the present invention is a sortase A, e.g. from S. aureus. In other embodiments, the transamidase is a sortase B, e.g. from S. aureus.”

Based on sequence alignment and phylogenetic analysis of 61 sortases from Gram-positive bacterial genomes, sortases have been divided into four classes, designated A, B, C, and D (Dramsi S, Trieu-Cuot P, Bierne H. Sorting sortases: a nomenclature proposal for the various sortases of Gram-positive bacteria. Res Microbiol. 156(3):289-97, 2005). These classes correspond to the following subfamilies, into which sortases were also classified by Comfort and Clubb (Comfort D, Clubb R T. A comparative genome analysis identifies distinct sorting pathways in gram-positive bacteria. Infect Immun. 72(5):2710-22, 2004): Class A (Subfamily 1), Class B (Subfamily 2), Class C (Subfamily 3), and Class D (Subfamilies 4 and 5). These references disclose numerous sortases and recognition motifs. See also Pallen M. J., Lam A. C., Antonio M., and Dunbar K. TRENDS in Microbiology 2001, 9(3): 97-101. Those skilled in the art can readily assign a sortase to the correct class based on its sequence and/or other characteristics, such as those described by Dramsi et al., supra. The term "sortase A" is used herein to denote a class A sortase, usually called SrtA in any particular bacterial species, e.g. SrtA from S. aureus. Likewise, "sortase B" is used herein to denote a class B sortase, usually called SrtB in any particular bacterial species. The invention includes embodiments that relate to a sortase A from any bacterial species or strain. The invention includes embodiments that relate to a sortase B from any bacterial species or strain. The invention includes embodiments that relate to a class C sortase from any bacterial species or strain. The invention includes embodiments that relate to a class D sortase from any bacterial species or strain.

“The amino acid sequences of SrtA and SrtB, and the nucleotide sequences that encode them, are well known to those skilled in the art and are disclosed in a number of references cited herein, the entire contents of each of which are incorporated herein by reference. The amino acid sequences of S. aureus SrtA and SrtB are homologous, sharing, for example, 22% sequence identity and 37% sequence similarity. The amino acid sequence of a sortase-transamidase from Staphylococcus aureus also has substantial homology with sequences of enzymes from other Gram-positive bacteria, and such transamidases may be used in the ligation procedures described herein. For example, SrtA has about 31% sequence identity and about 44% sequence similarity with the best alignment over the entire sequenced region of the S. pyogenes open reading frame, and about 28% sequence identity with the best alignment over the entire sequenced region of the A. naeslundii open reading frame. It will be appreciated that different bacterial strains may exhibit differences in the sequence of a particular polypeptide, and the sequences herein are exemplary.

“In certain embodiments, a transamidase bearing 18% or more, or 20% or more, sequence identity with an S. pyogenes, A. naeslundii, S. mutans, E. faecalis, or B. subtilis open reading frame encoding a sortase can also be screened, and enzymes having transamidase activity comparable to that of Srt A from S. aureus can be used (e.g. comparable activity is sometimes 10% or more of Srt A activity).
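The identity thresholds quoted above can be applied to a pairwise alignment with a short calculation. The following Python sketch assumes the two sequences are already aligned to equal length with "-" marking gaps, and it assumes identity is computed over columns in which neither sequence has a gap. The text does not state which denominator was used for its percentages, so that convention, like the function names, is an assumption of this sketch.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent of identical residues over the gap-free columns of an alignment.

    Both inputs must be the same length; '-' marks a gap.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    # Keep only columns where both sequences have a residue.
    columns = [
        (a, b)
        for a, b in zip(aligned_a.upper(), aligned_b.upper())
        if a != "-" and b != "-"
    ]
    if not columns:
        return 0.0
    matches = sum(1 for a, b in columns if a == b)
    return 100.0 * matches / len(columns)


def passes_identity_screen(aligned_a: str, aligned_b: str,
                           threshold: float = 18.0) -> bool:
    """True if identity meets the screening threshold (18% by default,
    the lower of the two figures given in the text)."""
    return percent_identity(aligned_a, aligned_b) >= threshold


if __name__ == "__main__":
    # Toy alignment, not real sortase sequences.
    print(percent_identity("MKT-AL", "MRTGAL"))
```

Passing such a screen only nominates a candidate; the text additionally requires transamidase activity comparable to Srt A, which must be measured experimentally.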

“In some embodiments of this invention, the sortase is a sortase A (SrtA). SrtA recognizes the motif LPXTG (SEQ ID NO: 2), with common recognition motifs including LPKTG, LPATG, and LPNTG (SEQ ID NO: 97). In some embodiments, LPETG (SEQ ID NO: 4) is used. However, motifs that fall outside this consensus may also be recognized. For example, in some embodiments, the motif includes an "A" rather than a "T" at position 4, e.g. LPXAG (SEQ ID NO: 98), e.g. LPNAG (SEQ ID NO: 99). In some embodiments, the motif includes an "A" rather than a "G" at position 5, e.g. LPXTA (SEQ ID NO: 100), e.g. LPNTA (SEQ ID NO: 101). In some embodiments, the motif includes a "G" rather than a "P" at position 2, e.g. LGXTG (SEQ ID NO: 102), e.g. LGATG (SEQ ID NO: 103). In some embodiments, the motif includes an "I" rather than an "L" at position 1, e.g. IPXTG (SEQ ID NO: 104), e.g. IPNTG (SEQ ID NO: 105) or IPETG (SEQ ID NO: 106).
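The consensus and the four single-position departures listed above can be captured in a small lookup, sketched here in Python. The function name and the returned labels are hypothetical. The sketch accepts only the substitutions the text names, one at a time; whether combinations of them are recognized by SrtA is not stated in the text, so the sketch returns None for such cases rather than guessing.

```python
# 1-based motif positions of LPXTG; position 3 (X) is unconstrained.
CONSENSUS = {1: "L", 2: "P", 4: "T", 5: "G"}

# Single-position substitutions named in the text:
# I for L at position 1, G for P at 2, A for T at 4, A for G at 5.
NAMED_SUBSTITUTIONS = {1: "I", 2: "G", 4: "A", 5: "A"}


def classify_srta_motif(pentapeptide: str):
    """Return a label for a consensus or named-variant motif, else None."""
    seq = pentapeptide.upper()
    if len(seq) != 5 or not seq.isalpha():
        return None
    # Positions at which the sequence departs from the LPXTG consensus.
    deviations = [pos for pos, res in CONSENSUS.items() if seq[pos - 1] != res]
    if not deviations:
        return "consensus LPXTG"
    if len(deviations) == 1:
        pos = deviations[0]
        if seq[pos - 1] == NAMED_SUBSTITUTIONS[pos]:
            return f"{seq[pos - 1]} at position {pos}"
    return None


if __name__ == "__main__":
    for motif in ("LPETG", "LPNAG", "LPNTA", "LGATG", "IPNTG", "IPNAG"):
        print(motif, classify_srta_motif(motif))
```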

“It will be appreciated that the terms "recognition motif" and "recognition sequence", when referring to sequences recognized by a transamidase or sortase, are used interchangeably. The term "transamidase recognition sequence" is sometimes abbreviated "TRS" herein.”

“In some embodiments, the sortase is a sortase B (SrtB), e.g. a sortase B of S. aureus, B. anthracis, or L. monocytogenes. Motifs recognized by sortases of the B class (SrtB) often fall within the consensus sequence NPXTX (SEQ ID NO: 154), e.g. NP[Q/K]-[T/S]-[N/G/S] (SEQ ID NO: 107), such as NPQTN (SEQ ID NO: 108) or NPKTG (SEQ ID NO: 109). For example, sortase B of S. aureus or B. anthracis cleaves the NPQTN or NPKTG (SEQ ID NO: 110) motif of IsdC in the respective bacteria (see Marraffini L. and Schneewind O., Journal of Bacteriology 189 (17), pp. 6425-6436, 2007). Other recognition motifs found in putative substrates of class B sortases are NSKTA (SEQ ID NO: 112), NPQTG (SEQ ID NO: 113), NAKTN (SEQ ID NO: 114), and NPQSS (SEQ ID NO: 115). For example, SrtB of L. monocytogenes recognizes certain motifs that lack P at position 2 and/or lack Q or K at position 3 (Mariscotti J F, Garcia-Del Portillo F, Pucciarelli M G. The Listeria monocytogenes sortase-B recognizes varied amino acids at position 2 of the sorting motif. J Biol Chem. 2009 Jan. 7).

“In certain embodiments, the sortase is a class C sortase. Class C sortases may use LPXTG (SEQ ID NO: 2) as a recognition motif.

“In certain embodiments, the sortase is a class D sortase. Sortases in this class are predicted to recognize motifs with the consensus sequence NA-[E/A/S/H]-TG (SEQ ID NO: 118) (Comfort D, supra). Class D sortases have been found, e.g., in Streptomyces, Corynebacterium, Tropheryma whipplei, Thermobifida fusca, and Bifidobacterium longum. LPXTA (SEQ ID NO: 100) or LAXTG (SEQ ID NO: 120) may serve as a recognition sequence for class D sortases, e.g., of subfamilies 4 and 5, respectively; subfamily-4 and subfamily-5 enzymes process the motifs LPXTA and LAXTG, respectively. B. anthracis sortase C, which is a class D sortase, has been shown to specifically cleave the LPNTA motif (SEQ ID NO: 123) in B. anthracis BasI (Marraffini, supra).

“For a description of a sortase that recognizes a QVPTGV (SEQ ID NO: 124) motif, see Barnett and Scott (Barnett T C, Scott J R. Differential recognition of surface proteins in Streptococcus pyogenes by two sortase gene homologs. Journal of Bacteriology, Vol. 184, No. 8, p. 2181-2191, 2002).”

“The invention contemplates the use of sortases found in any gram-positive organism, such as those mentioned herein and/or in the references (including databases) cited herein. The invention also contemplates the use of sortases found in gram-negative bacteria, e.g., Colwellia psychrerythraea, Microbulbifer degradans, Bradyrhizobium japonicum, and Shewanella oneidensis. They recognize sequence motifs such as LP[Q/K]T[A/S]T (SEQ ID NO: 121). In keeping with the variation tolerated at position 3 in sortases from gram-positive organisms, a sequence motif LPXT[A/S] may be used.

“The invention contemplates use of sortase recognition motifs from any of the experimentally verified or putative sortase substrates listed at bamics3.cmbi.kun.nl/jos/sortase_substrates/help.html, the contents of which are incorporated herein by reference, and/or in any of the above-mentioned references. In some embodiments, the sortase recognition motif is LPKTG. The sequence may comprise LPXT (SEQ ID NO: 144), LAXT (SEQ ID NO: 146), LPXA (SEQ ID NO: 155), LPDTA (SEQ ID NO: 72), SPKTG (SEQ ID NO: 77), LAATG (SEQ ID NO: 76), LAATG (SEQ ID NO: 78), LAHTG (SEQ ID NO: 79), LPETG (SEQ ID NO: 81), [...] (SEQ ID NO: 93), NAKT (SEQ ID NO: 94), [...] (SEQ ID NO: 97), NPQ[...] (SEQ ID NO: 95), NPQ[...] (SEQ ID NO: 70), or LPIT[...]. In some embodiments, 'X' in any sortase recognition motif disclosed herein may be any standard or non-standard amino acid. Each variation is disclosed. In some embodiments, X is selected from the 20 standard amino acids most commonly found in proteins of living organisms. In some embodiments, e.g., where the recognition motif is LPXTG (SEQ ID NO: 2) or LPXT (SEQ ID NO: 144), X is D, E, A, N, Q, K, or R. In other embodiments, X is selected from those amino acids that occur naturally at position 3 in a sortase substrate. In some embodiments, X is selected from K, E, N, Q, and A in an LPXTG (SEQ ID NO: 2) or LPXT (SEQ ID NO: 144) motif where the sortase is a sortase A. In some embodiments, X is selected from K, S, E, L, A, and N in an LPXTG (SEQ ID NO: 2) or LPXT (SEQ ID NO: 144) motif where a class C sortase is used.
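The position-3 choices described above can be made concrete by expanding a motif template into its allowed members. The sketch below is illustrative only: `ALLOWED_X` and `expand_motif` are hypothetical names, and the two residue sets are copied from the sortase A and class C embodiments in the preceding paragraph.

```python
# Residues allowed at position X, keyed by sortase class, as stated in the
# sortase A and class C embodiments above. The dictionary name is hypothetical.
ALLOWED_X = {
    "A": ("K", "E", "N", "Q", "A"),
    "C": ("K", "S", "E", "L", "A", "N"),
}


def expand_motif(template, sortase_class):
    """Expand a template with exactly one 'X' into concrete motifs."""
    if template.count("X") != 1:
        raise ValueError("template must contain exactly one 'X'")
    if sortase_class not in ALLOWED_X:
        raise ValueError("no residue set recorded for class " + sortase_class)
    return [template.replace("X", residue) for residue in ALLOWED_X[sortase_class]]


if __name__ == "__main__":
    print(expand_motif("LPXTG", "A"))
    print(expand_motif("LPXT", "C"))
```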

“In some embodiments, a recognition sequence further comprises one or more additional amino acids, e.g., at the N-terminus or the C-terminus. For example, one or more amino acids (e.g., up to five amino acids) having the identity of amino acids found immediately N-terminal or C-terminal (or both) to a 5-amino-acid recognition sequence in a naturally occurring sortase substrate may be incorporated. Such additional amino acids may provide context that improves recognition of the recognition motif.

“The term 'transamidase recognition sequence' may refer to a masked or an unmasked transamidase recognition sequence. An unmasked transamidase recognition sequence can be recognized by a transamidase. An unmasked transamidase recognition sequence may have been previously masked, e.g., as described herein. In some embodiments, a 'masked transamidase recognition sequence' is a sequence that is not recognized by a transamidase but that can be readily modified ('unmasked') so that the resulting sequence is recognized by a transamidase. In some embodiments, at least one amino acid of a masked transamidase recognition sequence has a side chain comprising a moiety that inhibits, e.g., substantially prevents, recognition of the sequence by a transamidase of interest; removal of the moiety allows the transamidase to recognize the sequence. In certain embodiments, masking reduces recognition by at least 80%, 90%, 95%, or more. For example, in certain embodiments a threonine residue in a transamidase recognition sequence such as LPXTG (SEQ ID NO: 2) is phosphorylated, rendering it refractory to recognition and cleavage by SrtA. The masked recognition sequence can be unmasked by treatment with a phosphatase, allowing it to be used in a SrtA-catalyzed transamidation reaction.
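The masking logic in the phosphothreonine example can be summarized as a two-state model: a phosphorylated threonine blocks recognition, and phosphatase treatment restores it. The sketch below is a toy model under that assumption; the class and function names are hypothetical and nothing here models reaction kinetics or partial inhibition.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class RecognitionSequence:
    """A transamidase recognition sequence with an optional phospho-Thr mask."""

    residues: str                     # one-letter codes, e.g. "LPETG"
    thr_phosphorylated: bool = False  # True models the masked state

    def recognized_by_srta(self):
        """In this toy model, only the unmasked sequence is recognized."""
        return not self.thr_phosphorylated


def phosphatase_treat(seq):
    """Return an unmasked copy, modelling removal of the phosphate group."""
    return replace(seq, thr_phosphorylated=False)


if __name__ == "__main__":
    masked = RecognitionSequence("LPETG", thr_phosphorylated=True)
    print(masked.recognized_by_srta())                     # masked state
    print(phosphatase_treat(masked).recognized_by_srta())  # after unmasking
```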

“Modified Proteins Comprising Click Chemistry Handles”

“Some embodiments provide a modified protein (PRT) comprising a C-terminal click chemistry handle (CCH), wherein the modified protein comprises a structure according to Formula (I):
PRT-LPXT-[Xaa]y-CCH (I) (SEQ ID NO: 158).

“Some embodiments provide a modified protein (PRT) comprising an N-terminal click chemistry handle (CCH), wherein the modified protein comprises a structure according to Formula (II):
CCH-[Xaa]y-LPXT-PRT (II) (SEQ ID NO: 159).
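To show how a structure of Formula (I) arises, the sketch below mimics the sortase-mediated exchange described in the background: the residues C-terminal of the threonine in an LPXTG motif are replaced by an oligoglycine peptide carrying the handle. It is a string-level illustration only. The function name is hypothetical, [Xaa]y is assumed to be a run of glycines, and X is assumed to be one of the 20 standard amino acids.

```python
import re

# LPXTG with X restricted to the 20 standard amino acids (an assumption).
LPXTG = re.compile(r"LP[ACDEFGHIKLMNPQRSTVWY]TG")


def install_c_terminal_handle(protein, n_gly, handle):
    """Return a Formula (I)-style string: PRT-LPXT-[Gly]y-CCH.

    The last LPXTG site in `protein` is cut after the threonine and the
    remainder is replaced by `n_gly` glycines plus the handle label.
    """
    if n_gly < 1:
        raise ValueError("the nucleophile needs at least one N-terminal glycine")
    sites = list(LPXTG.finditer(protein))
    if not sites:
        raise ValueError("no LPXTG motif found in the substrate")
    cut = sites[-1].start() + 4  # index just past the threonine
    return protein[:cut] + "G" * n_gly + "-" + handle


if __name__ == "__main__":
    # A hypothetical substrate: protein body, LPETG motif, His6 tag.
    print(install_c_terminal_handle("MKTAYLPETGHHHHHH", 3, "azide"))
```

In this toy substrate the His6 tag downstream of the motif is lost on ligation, which mirrors the exchange of the residues C-terminal of the threonine.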

Click chemistry reactions allow two proteins that each carry a click chemistry handle to be covalently conjugated (e.g., a first protein whose click chemistry handle provides a nucleophilic (Nu) group and a second protein whose click chemistry handle provides an electrophilic (E) group that reacts with the Nu group of the first handle). Click chemistry was introduced by Sharpless in 2001 and describes chemistry tailored to generate substances quickly and reliably by joining small units together. (See, e.g., Kolb, Finn, and Sharpless, Angewandte Chemie International Edition (2001) 40: 2004-2021; Evans, Australian Journal of Chemistry (2007) 60: 384-395.) Additional examples of click chemistry reactions, reaction conditions, and related methods that are useful according to this invention can be found in Joerg Lahann, Click Chemistry for Biotechnology and Materials Science, 2009, John Wiley & Sons Ltd, ISBN 978-0-470-69970-6.

Click chemistry should be modular and broad in scope, give high chemical yields, generate only inoffensive byproducts, and be physiologically stable. Several reactions have been identified that fit this concept:

“(1) The Huisgen 1,3-dipolar cycloaddition (e.g., the Cu(I)-catalyzed stepwise variant, often referred to simply as the 'click reaction'; see, e.g., Tornoe et al., Journal of Organic Chemistry (2002) 67: 3057-3064). Copper and ruthenium are the most commonly used catalysts for the reaction. Copper catalysis yields the 1,4-regioisomer, whereas ruthenium catalysis yields the 1,5-regioisomer;

“(2) Other cycloaddition reactions, such as the Diels-Alder reaction;

“(3) Nucleophilic addition to small strained rings such as epoxides and aziridines;

“(4) Nucleophilic addition to activated carbonyl groups; and

“(5) Addition reactions to carbon-carbon double or triple bonds.”

“Conjugation of Proteins via Click Chemistry Handles”

Click chemistry can be used to conjugate two proteins, provided that the click chemistry handles of the proteins are reactive with each other, for example, in that the reactive moiety of one click chemistry handle can react with the reactive moiety of the other click chemistry handle to form a covalent bond. Such reactive pairs of click chemistry handles are well known to those of skill in the art and include, but are not limited to, those described in Table I.

“TABLE I: Exemplary click chemistry handles and reactions, wherein each occurrence of R1, R2 is independently PRT-LPXT-[Xaa]y- (SEQ ID NO: 158) or -[Xaa]y-LPXT-PRT (SEQ ID NO: 159), according to Formulas (I) and (II).
1,3-dipolar cycloaddition
Strain-promoted cycloaddition
Diels-Alder reaction
Thiol-ene reaction

“In some preferred embodiments, click chemistry handles are used that can react to form covalent bonds in the absence of a metal catalyst. Such click chemistry handles are well known to those of skill in the art and include the click chemistry handles described in Becer, Hoogenboom, and Schubert, Click Chemistry beyond Metal-Catalyzed Cycloaddition, Angewandte Chemie International Edition (2009) 48: 4900-4908.

“Other click chemistry handles suitable for use in the methods of protein conjugation described herein are well known to those of skill in the art. Such click chemistry handles include the click chemistry reaction partners, groups, and handles described in: [1] H. C. Kolb, M. G. Finn, K. B. Sharpless, Angew. Chem. 2001, 113, 2056-2075; Angew. Chem. Int. Ed. 2001, 40, 2004-2021. [2] a) C. J. Hawker, K. L. Wooley, Science 2005, 309, 1200-1205; b) D. Fournier, R. Hoogenboom, U. S. Schubert, Chem. Soc. Rev. 2007, 36, 1369-1380; c) W. H. Binder, R. Sachsenhofer, Macromol. Rapid Commun. 2007, 28, 15-54; d) H. C. Kolb, K. B. Sharpless, Drug Discovery Today 2003, 8, 1128-1137; e) V. D. Bock, H. Hiemstra, J. H. van Maarseveen, Eur. J. Org. Chem. 2006, 51-68. [3] a) V. O. Rodionov, V. V. Fokin, M. G. Finn, Angew. Chem. 2005, 117, 2250-2255; Angew. Chem. Int. Ed. 2005, 44, 2210-2215; b) P. L. Golas, N. V. Tsarevsky, B. S. Sumerlin, K. Matyjaszewski, Macromolecules 2006, 39, 6451-6457; c) C. N. Urbani, C. A. Bell, M. R. Whittaker, M. J. Monteiro, Macromolecules 2008, 41, 1057-1060; d) S. Chassaing, A. S. S. Sido, A. Alix, M. Kumarraja, P. Pale, J. Sommer, Chem. Eur. J. 2008, 14, 6713-6721; e) B. C. Boren, S. Narayan, L. K. Rasmussen, L. Zhang, H. Zhao, Z. Lin, G. Jia, V. V. Fokin, J. Am. Chem. Soc. 2008, 130, 8923-8930; f) B. Saba, S. Sharma, D. Sawant, B. Kundu, Synlett 2007, 1591-1594. [4] J. F. Lutz, Angew. Chem. 2008, 120, 2212-2214; Angew. Chem. Int. Ed. 2008, 47, 2182-2184. [5] a) Q. Wang, T. R. Chan, R. Hilgraf, V. V. Fokin, K. B. Sharpless, M. G. Finn, J. Am. Chem. Soc. 2003, 125, 3192-3193; b) J. Gierlich, G. A. Burley, P. M. E. Gramlich, D. M. Hammond, T. Carell, Org. Lett. 2006, 8, 3639-3642. [6] a) J. M. Baskin, J. A. Prescher, S. T. Laughlin, N. J. Agard, P. V. Chang, I. A. Miller, A. Lo, J. A. Codelli, C. R. Bertozzi, Proc. Natl. Acad. Sci. [...] A. Johnson, J. M. Baskin, C. R. Bertozzi, J. F. Koberstein, N. J. Turro, Chem. Commun. 2008, 3064-3066; d) J. A. Codelli, J. M. Baskin, N. J. Agard, C. R. Bertozzi, J. Am. Chem. Soc. 2008, 130, 11486-11493; e) E. M. Sletten, C. R. Bertozzi, Org. Lett. 2008, 10, 3097-3099; f) J. M. Baskin, C. R. Bertozzi, QSAR Comb. Sci. 2007, 26, 1211-1219. [7] a) G. Wittig, A. Krebs, Chem. Ber. Recl. 1961, 94, 3260-3275; b) A. T. Blomquist, L. H. Liu, J. Am. Chem. Soc. 1953, 75, 2153-2154. [8] D. H. Ess, G. O. Jones, K. N. Houk, Org. Lett. 2008, 10, 1633-1636. [9] W. D. Sharpless, P. Wu, T. V. Hansen, J. G. Lindberg, J. Chem. Educ. 2005, 82, 1833-1836. [10] Y. Zou, J. Yin, Bioorg. Med. Chem. Lett. 2008, 18, 5664-5667. [11] X. Ning, J. Guo, M. A. Wolfert, G. J. Boons, Angew. Chem. 2008, 120, 2285-2287; Angew. Chem. Int. Ed. 2008, 47, 2253-2255. [12] S. Sawoo, P. Dutta, A. Chakraborty, R. Mukhopadhyay, O. Bouloussa, A. Sarkar, Chem. Commun. 2008, 5957-5959. [13] a) Z. Li, T. S. Seo, J. Ju, Tetrahedron Lett. 2004, 45, 3143-3146; b) S. S. van Berkel, A. J. Dirks, M. F. Debets, F. L. van Delft, J. J. L. Cornelissen, R. J. M. Nolte, F. P. J. Rutjes, ChemBioChem 2007, 8, 1504-1508; c) S. S. van Berkel, A. J. Dirks, S. A. Meeuwissen, D. L. L. Pingen, O. C. Boerman, P. Laverman, F. L. van Delft, J. J. L. Cornelissen, F. P. J. Rutjes, ChemBioChem 2008, 9, 1805-1815. [14] F. Shi, J. P. Waldo, Y. Chen, R. C. Larock, Org. Lett. 2008, 10, 2409-2412. [15] L. Campbell-Verduyn, P. H. Elsinga, L. Mirfeizi, R. A. Dierckx, B. L. Feringa, Org. Biomol. Chem. 2008, 6, 3461-3463. [16] a) The Chemistry of the Thiol Group (Ed.: S. Patai), Wiley, New York, 1974; b) A. F. Jacobine in Radiation Curing in Polymer Science and Technology III (Eds.: J. D. Fouassier, J. F. Rabek), Elsevier, London, 1993, Chap. 7, pp. 219-268. [17] C. E. Hoyle, T. Y. Lee, T. Roper, J. Polym. Sci. Part A 2004, 42, 5301-5338. [18] L. M. Campos, K. L. Killops, R. Sakai, J. M. J. Paulusse, D. Damiron, E. Drockenmuller, B. W. Messmore, C. J. Hawker, Macromolecules 2008, 41, 7063-7070. [19] a) R. L. A. David, J. A. Kornfield, Macromolecules 2008, 41, 1151-1161; b) C. Nilsson, N. Simpson, M. Malkoch, M. Johansson, E. Malmstrom, J. Polym. Sci. Part A 2008, 46, 1339-1348; c) A. Dondoni, Angew. Chem. 2008, 120, 9133-9135; Angew. Chem. Int. Ed. 2008, 47, 8995-8997; d) J. F. Lutz, H. Schlaad, Polymer 2008, 49, 817-824. [20] A. Gress, A. Voelkel, H. Schlaad, Macromolecules 2007, 40, 7928-7933. [21] N. ten Brummelhuis, C. Diehl, H. Schlaad, Macromolecules 2008, 41, 9946-9947. [22] K. L. Killops, L. M. Campos, C. J. Hawker, J. Am. Chem. Soc. 2008, 130, 5062-5064. [23] J. W. Chan, B. Yu, C. E. Hoyle, A. B. Lowe, Chem. Commun. 2008, 4959-4961. [24] a) G. Moad, E. Rizzardo, S. H. Thang, Acc. Chem. Res. 2008, 41, 1133-1142; b) C. Barner-Kowollik, M. Buback, B. Charleux, M. L. Coote, M. Drache, T. Fukuda, A. Goto, B. Klumperman, A. B. Lowe, J. B. McLeary, G. Moad, M. J. Monteiro, R. D. Sanderson, M. P. Tonge, P. Vana, J. Polym. Sci. Part A 2006, 44, 5809-5831. [25] a) R. J. Pounder, M. J. Stanford, P. Brooks, S. P. Richards, A. P. Dove, Chem. Commun. 2008, 5158-5160; b) M. J. Stanford, A. P. Dove, Macromolecules 2009, 42, 141-147. [26] M. Li, P. De, S. R. Gondi, B. S. Sumerlin, J. Polym. Sci. Part A 2008, 46, 5093-5100. [27] Z. J. Witczak, D. Lorchak, N. Nguyen, Carbohydr. Res. 2007, 342, 1929-1933. [28] a) D. Samaroo, M. Vinodu, X. Chen, C. M. Drain, J. Comb. Chem. 2007, 9, 998-1011; b) X. Chen, D. A. Foster, C. M. Drain, Biochemistry 2004, 43, 10918-10929; c) D. Samaroo, C. E. Soll, L. J. Todaro, C. M. Drain, Org. Lett. 2006, 8, 4985-4988. [29] P. Battioni, O. Brigaud, H. Desvaux, D. Mansuy, T. G. Traylor, Tetrahedron Lett. 1991, 32, 2893-2896. [30] C. Ott, R. Hoogenboom, U. S. Schubert, Chem. Commun. 2008, 3516-3518. [31] a) V. Ladmiral, G. Mantovani, G. J. Clarkson, S. Cauet, J. L. Irwin, D. M. Haddleton, J. Am. Chem. Soc. 2006, 128, 4823-4830; b) S. G. Spain, M. I. Gibson, N. R. Cameron, J. Polym. Sci. Part A 2007, 45, 2059-2072. [32] C. R. Becer, K. Babiuch, K. Pilz, S. Hornig, T. Heinze, M. Gottschaldt, U. S. Schubert, Macromolecules 2009, 42, 2387-2394. [33] The reaction was first described by Otto Paul Hermann Diels and Kurt Alder in 1928. Their work on the eponymous reaction earned them the Nobel Prize in Chemistry in 1950. [34] a) H. L. Holmes, R. M. Husband, C. C. Lee, P. Kawulka, J. Am. Chem. Soc. 1948, 70, 141-142; b) M. Lautens, W. Klute, W. Tam, Chem. Rev. 1996, 96, 49-92; c) K. C. Nicolaou, S. A. Snyder, T. Montagnon, G. Vassilikogiannakis, Angew. Chem. 2002, 114, 1742-1773; Angew. Chem. Int. Ed. 2002, 41, 1668-1698; d) E. J. Corey, Angew. Chem. 2002, 114, 1724-1741; Angew. Chem. Int. Ed. 2002, 41, 1650-1667. [35] a) H. Durmaz, O. Altintas, T. Erdogan, G. Hizal, U. Tunca, Macromolecules 2007, 40, 191-198; b) H. Durmaz, A. Dag, A. Hizal, G. Hizal, U. Tunca, J. Polym. Sci. Part A 2008, 46; c) A. Dag, H. Durmaz, E. Demir, G. Hizal, U. Tunca, J. Polym. Sci. Part A 2008, 46, 6969-6977; d) B. Gacal, H. Akat, D. K. Balta, N. Arsu, Y. Yagci, Macromolecules 2008, 41, 2401-2405; e) A. Dag, H. Durmaz, U. Tunca, G. Hizal, J. Polym. Sci. [36] M. L. Blackman, M. Royzen, J. M. Fox, J. Am. Chem. Soc. 2008, 130, 13518-13519. [37] It should be noted that trans-cyclooctene, which is seven orders of magnitude more reactive toward tetrazines than cis-cyclooctene, is the most reactive. [38] N. K. Devaraj, R. Weissleder, S. A. Hilderbrand, Bioconjugate Chem. 2008, 19, 2297-2299. [39] W. Song, Y. Wang, J. Qu, Q. Lin, J. Am. Chem. Soc. 2008, 130, 9654-9655. [40] W. Song, Y. Wang, J. Qu, M. M. Madden, Q. Lin, Angew. Chem. 2008, 120, 2874-2877; Angew. Chem. Int. Ed. 2008, 47, 2832-2835. [41] A. Dag, H. Durmaz, G. Hizal, U. Tunca, J. Polym. Sci. Part A 2008, 46, 302-313. [42] a) A. J. Inglis, S. Sinnwell, T. P. Davis, C. Barner-Kowollik, M. H. Stenzel, Macromolecules 2008, 41, 4120-4126; b) S. Sinnwell, A. J. Inglis, T. P. Davis, M. H. Stenzel, C. Barner-Kowollik, Chem. Commun. 2008, 2052-2054. [43] A. J. Inglis, S. Sinnwell, M. H. Stenzel, C. Barner-Kowollik, Angew. Chem. 2009, 121, 2447-2450; Angew. Chem. Int. Ed. 2009, 48, 2411-2414. All references cited above are incorporated herein for their disclosure of click chemistry handles that can be installed on proteins using the inventive concepts and methods.

“For example, some embodiments provide a first protein comprising a C-terminal strained alkyne, such as a C-terminal cyclooctyne, as its click chemistry handle, and a second protein comprising a C-terminal azide as its click chemistry handle. The two click chemistry handles are reactive with each other: they can undergo a strain-promoted cycloaddition, which leaves the first and second proteins joined by a covalent bond. In this example, the two proteins are conjugated via their C-termini (C-to-C).
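The requirement that two handles be mutually reactive can be expressed as a lookup from an unordered handle pair to a reaction class. The sketch below uses the four reaction classes named in Table I. The handle labels paired with each class (for instance, azide with terminal alkyne) are assumptions drawn from general click chemistry usage rather than from the table itself, and all function names are hypothetical.

```python
# Reaction classes follow Table I; the handle pairings are illustrative assumptions.
REACTIONS = {
    frozenset({"azide", "terminal alkyne"}): "1,3-dipolar cycloaddition",
    frozenset({"azide", "strained alkyne"}): "strain-promoted cycloaddition",
    frozenset({"diene", "dienophile"}): "Diels-Alder reaction",
    frozenset({"thiol", "alkene"}): "thiol-ene reaction",
}


def click_reaction(handle_a, handle_b):
    """Return the reaction class for two handles, or None if they are not partners."""
    return REACTIONS.get(frozenset({handle_a, handle_b}))


def conjugate(protein_a, protein_b):
    """Describe the conjugate of two (name, terminus, handle) tuples.

    Raises ValueError when the two handles are not a known reactive pair.
    """
    name_a, end_a, handle_a = protein_a
    name_b, end_b, handle_b = protein_b
    reaction = click_reaction(handle_a, handle_b)
    if reaction is None:
        raise ValueError(handle_a + " and " + handle_b + " are not click partners")
    return name_a + "-" + name_b + " (" + end_a + "-to-" + end_b + ") via " + reaction


if __name__ == "__main__":
    first = ("protein1", "C", "strained alkyne")
    second = ("protein2", "C", "azide")
    print(conjugate(first, second))
```

Keying the table on a `frozenset` makes the lookup independent of argument order, which matches the symmetric wording "reactive with each other".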

“In some embodiments, a first molecule, for example a first protein, comprising a nucleophilic click chemistry handle (Nu) selected from -SH, -OH, -NHRb5, -NH-NHRb5, or -N=NH, is conjugated to a second molecule, for example a second protein, comprising the electrophilic partner click chemistry handle (E)

“to form a chimeric protein with a conjugated group of the formula:

“Zb9 is -S-, -O-, -N(Rb5)-, -NH-N(Rb5)-, or -N=N-. In some embodiments, Nu is -SH and Zb9 is -S-. In some embodiments, Nu is -OH and Zb9 is -O-. In some embodiments, Nu is -NHRb5 and Zb9 is -N(Rb5)-. In some embodiments, Nu is -NH-NHRb5 and Zb9 is -NH-N(Rb5)-. In some embodiments, Nu is -N=NH and Zb9 is -N=N-. In certain embodiments, Rb5 is hydrogen.
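The correspondence between the nucleophilic handle Nu and the linking group Zb9 is a one-to-one mapping, sketched below as a dictionary. The ASCII spellings of the groups (hyphens for single bonds, '=' for the double bond in the diazene pair) are a rendering assumed for this example, and `LINKAGE_FOR_NU` and `linkage_for` are hypothetical names.

```python
# Nu handle -> Zb9 linking group formed on conjugation.
# Group spellings are an ASCII rendering assumed for this sketch.
LINKAGE_FOR_NU = {
    "-SH": "-S-",
    "-OH": "-O-",
    "-NHRb5": "-N(Rb5)-",
    "-NH-NHRb5": "-NH-N(Rb5)-",
    "-N=NH": "-N=N-",
}


def linkage_for(nu):
    """Return the Zb9 group that corresponds to a nucleophilic handle."""
    if nu not in LINKAGE_FOR_NU:
        raise ValueError("unrecognized nucleophilic handle: " + nu)
    return LINKAGE_FOR_NU[nu]


if __name__ == "__main__":
    for nu in LINKAGE_FOR_NU:
        print(nu, "->", linkage_for(nu))
```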

“In some embodiments, Nu is -SH, -OH, -NHRb5, -NH-NHRb5, or -N=NH,

“and the two molecules, for example two proteins, are conjugated to form a chimeric molecule, for example a chimeric protein, wherein Nu and E are joined to form a conjugated group of the formula:
